Abstract

In this paper we investigate the word problem of the free Burnside
semigroup satisfying $x^2 = x^3$ and having two generators. Elements
of this semigroup are classes of equivalent words. A natural way to
solve the word problem is to select a unique "canonical" representative
for each equivalence class. We prove that overlap-free words and
so-called almost overlap-free words (a generalization of overlap-free
words) can serve as canonical representatives for the corresponding
equivalence classes. We show that such a word in a given class, if any,
can be efficiently found. As a result, we construct a linear-time
algorithm that partially solves the word problem for the semigroup
under consideration.
Introduction

The free Burnside semigroups are defined by the identities
$x^n = x^{n+m}$, which impose the equivalence of the n-th and (n+m)-th
powers of words in any context. Thus elements of these semigroups are
classes of equivalent words. The structure of free Burnside semigroups
is far from being completely described. However, considerable progress
was achieved in the 1990s. Kadourek and Polak [6], de Luca and
Varricchio [10], McCammond [11], Guba (HE], and do Lago [HE] produced a
series of papers that led to the discovery of many structural
properties of free Burnside semigroups. The reader is referred to the
survey [H] for the history and the formulations of remarkable results.

The word problem is probably the most important and challenging
combinatorial problem related to free Burnside semigroups. It is
formulated as follows: given words U and V, decide whether or not U and
V are equivalent in a given semigroup. In a series of papers
[U EJ [7J [TUl ED] the word problem was solved for all free Burnside
semigroups satisfying $x^n = x^{n+m}$ for $n \ge 3$ and $m \ge 1$. In
this case, all equivalence classes are regular languages, and the
deciding algorithm for the word problem constructs an automaton
recognizing the class of the word U and tries to accept V by this
automaton (see |H E]). Due to Green and Rees [3] and later Kadourek and
Polak [6], the word problem for the case n = 1 was solved modulo
periodic groups (i.e., reduced to the word problem for the groups
satisfying $x^m = 1$). For the case n = 2, this problem remains open
(and some equivalence classes are not regular languages, see [8]). Note
that the word problem for the particular case n = 2 and m = 1 was
explicitly formulated by Brzozowski [2]. This case was considered to be
the hardest one to analyze (see [7]). In what follows, we consider the
free Burnside semigroup satisfying $x^2 = x^3$ and having two
generators.

A natural way to solve the word problem is to select a unique "canonical" 
representative for each equivalence class. Thus every word is equivalent to 
exactly one "canonical" word. If the latter can be efficiently found, then 
the word problem is decidable. It is clear that the choice of the canonical 
representatives is not an easy task. For example, we cannot take just a cube- 
free word as a representative, since there exist equivalent cube-free words. 
And if we take the shortest word in the class as a representative, it may be 
very hard to determine such a word. 

As was proved in [13], overlap-free words can serve as canonical repre- 
sentatives for corresponding equivalence classes. In this paper we generalize 
this result for "almost" overlap-free words and show that such a word in a 
given class, if any, can be efficiently found. Thus we construct an efficient 
(in fact, linear-time) algorithm that partially solves the word problem for the 
semigroup under consideration. 

To give precise formulations of the main results, we say a few words about 
definitions and notation. 

Let $\Sigma = \{a, b\}$. As usual, we write $\Sigma^*$ for the monoid of
all words over $\Sigma$ (including the empty word $\lambda$) and
$\Sigma^+$ for the semigroup of all non-empty words over $\Sigma$. For a
word W, its length is denoted by $|W|$ and its i-th letter is denoted by
$W[i]$; thus, $W = W[1] \ldots W[|W|]$. In the sequel, we write
$W[i \ldots j]$ instead of $W[i] \ldots W[j]$. Factors, prefixes,
suffixes, and powers of a word are defined in the usual way. Recall that
the Kleene star $W^*$ of W is the set of all nonnegative powers of the
word W.

A factor, prefix, or suffix of a word W is called proper if it is not
equal to W. A factor of a word W is called internal if it is neither a
prefix nor a suffix of W. We write $U \le V$ ($U < V$, $U \ll V$) if a
word U is a factor (resp., proper factor, internal factor) of a word V.

A word U is called overlap-free if it contains no factor of the form
XYXYX for any $X \in \Sigma^+$, $Y \in \Sigma^*$. If U contains no
proper factor of the form above, then we call it almost overlap-free.
Finally, we call a word cube-free if it contains no factor of the form
XXX for any $X \in \Sigma^+$.
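The three definitions above translate directly into brute-force checks. The following Python sketch is an illustration only: the function names are ours (they do not come from the paper), words are plain strings over {a, b}, and no attempt is made at efficiency. It relies on two elementary facts: a word contains a factor XYXYX with non-empty X if and only if it contains a factor of length 2p + 1 with period p, and every proper factor of W is a factor of W without its first letter or of W without its last letter.

```python
def has_overlap(w: str) -> bool:
    """Return True if w contains a factor XYXYX with X non-empty."""
    n = len(w)
    for p in range(1, n // 2 + 1):          # p plays the role of |XY|
        for i in range(n - 2 * p):          # start of a factor of length 2p + 1
            if all(w[i + k] == w[i + k + p] for k in range(p + 1)):
                return True
    return False


def is_overlap_free(w: str) -> bool:
    return not has_overlap(w)


def is_almost_overlap_free(w: str) -> bool:
    """No *proper* factor of w is an overlap."""
    return not has_overlap(w[1:]) and not has_overlap(w[:-1])


def is_cube_free(w: str) -> bool:
    """Return True if w contains no factor XXX with X non-empty."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return False
    return True
```

For instance, the word aaa is almost overlap-free but not overlap-free, whereas ababaa is not even almost overlap-free because of its proper factor ababa.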

Two words U and V are neighbours if and only if one of them can be
obtained from the other by replacing some factor of the form $Y^2$ by
the factor $Y^3$. Thus we have the neighbourhood relation

$\pi = \{(XY^2Z, XY^3Z), (XY^3Z, XY^2Z) \mid X, Z \in \Sigma^*, Y \in \Sigma^+\}$.

The free Burnside semigroup satisfying $x^2 = x^3$ generated by $\Sigma$
is defined as the quotient semigroup $\Sigma^+/{\sim}$, where $\sim$ is
the smallest congruence containing $\pi$. If $U \sim V$, then the words
U and V are said to be equivalent. The congruence class of U is denoted
by $[U]$.
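To make the neighbourhood relation concrete, the following Python sketch enumerates all neighbours of a given word by trying every factor of the form $Y^2$ or $Y^3$. The function name `neighbours` is our own choice, and the enumeration is a naive illustration rather than part of the algorithm developed in the paper.

```python
def neighbours(w: str) -> set:
    """All words obtained from w by one replacement Y^2 -> Y^3 or Y^3 -> Y^2."""
    result = set()
    n = len(w)
    for i in range(n):                              # start of the factor
        for p in range(1, (n - i) // 2 + 1):        # p = |Y|
            y = w[i:i + p]
            if w[i + p:i + 2 * p] != y:
                continue                            # no square Y^2 at position i
            result.add(w[:i] + y + w[i:])           # X Y^2 Z  ->  X Y^3 Z
            if w[i + 2 * p:i + 3 * p] == y:
                result.add(w[:i] + w[i + p:])       # X Y^3 Z  ->  X Y^2 Z
    return result
```

Two words are equivalent exactly when they are connected by a finite chain of such steps; since the chain may pass through arbitrarily long words, this enumeration alone does not decide equivalence.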
In these terms, the first result of this paper is formulated as follows.

Theorem 1. Except for the classes [aa] and [bb], each class of the
congruence $\sim$ contains at most one almost overlap-free word. Each of
the two exceptional classes contains exactly two almost overlap-free
words.

Theorem 1 for overlap-free words was proved in [13]. In this paper the
proof from [13] is simplified and applied to a more general case.

The second result of this paper now follows. 

Theorem 2. If at least one of the words U and V is equivalent to an
almost overlap-free word, then the word problem for the pair (U, V) can
be solved in time $O(n)$, where $n = \max\{|U|, |V|\}$.

In fact, we will construct Algorithm EqAOF (abbr. Equivalent Almost
Overlap-Free), which returns the almost overlap-free word V that is
equivalent to a given input word U or reports that no such word V
exists. It should be mentioned that some equivalence classes contain no
almost overlap-free words, like the class $[ababaa] = (ab)^*ababaa$.

Sketches of the proofs of Theorems 1 and 2 were given in [12]. Here we
present a full version of these proofs.

The text is subdivided into six sections. In Sect. 1 we introduce the
main tools and techniques. Sect. 2 contains the proof of Theorem 1. The
last four sections are devoted to the construction and analysis of
Algorithm EqAOF.

1 The main tools and techniques

Recall that the Thue-Morse morphism $\varphi$ of $\Sigma^+$ is defined
by the rule

$\varphi(a) = ab$, $\varphi(b) = ba$.

A $\varphi$-image is any word $U \in \varphi(\Sigma^+)$.
One can easily check the following.

Observation 1.1. The set $\varphi(\Sigma^+)$ consists exactly of all
even-length words such that all factors aa or bb start at even positions.
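As an illustration, here is a minimal Python sketch of the Thue-Morse morphism, its inverse on $\varphi$-images, and the test suggested by Observation 1.1. The function names are ours. Positions in the paper are counted from 1, so a factor aa or bb "starting at an even position" starts at an odd index in Python's 0-based indexing.

```python
PHI = {"a": "ab", "b": "ba"}


def phi(w: str) -> str:
    """Thue-Morse morphism: a -> ab, b -> ba."""
    return "".join(PHI[c] for c in w)


def phi_inverse(w: str):
    """Return the preimage of w under phi, or None if w is not a phi-image."""
    if len(w) % 2:
        return None
    letters = []
    for i in range(0, len(w), 2):
        block = w[i:i + 2]
        if block == "ab":
            letters.append("a")
        elif block == "ba":
            letters.append("b")
        else:
            return None                     # block aa or bb: not a phi-image
    return "".join(letters)


def looks_like_phi_image(w: str) -> bool:
    """Criterion of Observation 1.1 (0-based: no aa/bb at an even index)."""
    return len(w) % 2 == 0 and all(
        w[i] != w[i + 1] for i in range(0, len(w) - 1, 2)
    )
```

On non-empty words the two tests agree: `phi_inverse(w)` succeeds exactly when `looks_like_phi_image(w)` holds.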

The main idea of our solution to the word problem for an instance (U, V)
is to simplify and shorten the words U and V. Each simplification is
either an equivalent transformation of a word or a simultaneous
transformation of a pair of words, preserving their
equivalence/non-equivalence. The main instrument is the function
$\varphi^{-1}$ applied to a pair of words. This function reduces the
length of both words to one half. All other transformations are needed
to get a pair (U', V') of $\varphi$-images U' and V'. These
transformations are

- complete reduction of a word, which is an equivalent transformation on
the class of so-called AB-whole words;

- tail reductions applied to a pair of words in order to make the words
AB-whole;

- functions $\xi$ and $\eta$ applied to a pair of completely reduced
words in order to turn them into $\varphi$-images.

In this preliminary section we introduce all these transformations and
study their properties.
1.1 Uniformity and complete reduction. Following [TJ], we generalize
the notion of $\varphi$-image. Namely, we say that a word U is uniform
if all its factors aa or bb start in U either always at even positions
or always at odd positions. Otherwise a word is called non-uniform. We
call a word letter-alternating if it contains no factor aa or bb. In
what follows, A (resp., B) abbreviates an arbitrary letter-alternating
word of the form $aba(ba)^*$ (resp., $bab(ab)^*$). All
letter-alternating words are obviously uniform.
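The notions of uniform and letter-alternating words can be checked in a single pass over the word. The sketch below is only an illustration with function names of our own choosing; it records the parity of every position at which a factor aa or bb starts.

```python
def is_letter_alternating(w: str) -> bool:
    """True if w contains no factor aa or bb."""
    return all(w[i] != w[i + 1] for i in range(len(w) - 1))


def is_uniform(w: str) -> bool:
    """True if all factors aa or bb start at positions of the same parity."""
    parities = {i % 2 for i in range(len(w) - 1) if w[i] == w[i + 1]}
    return len(parities) <= 1
```

For example, abba and aabb are uniform, while aabaa is not: its two factors aa start at positions of different parity.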

Uniformity plays a crucial role in subsequent considerations. First,
uniform words form a vast majority among all almost overlap-free words.
Second, an important connection between the uniformity, the Thue-Morse
morphism, and the congruence $\sim$ was established in [JJ:

Proposition 1.1. Suppose $W \sim \varphi(U)$ for some $U \in \Sigma^+$
and a uniform word W. Then there exists a word $V \in \Sigma^+$ such
that $W = \varphi(V)$ and $V \sim U$.

According to Proposition 1.1, since the Thue-Morse morphism preserves
the congruence $\sim$, we have $\varphi(V) \sim \varphi(U)$ if and only
if $V \sim U$. Thus the word problem for two $\varphi$-images U and V
can be reduced to the word problem for the pair of shorter words
$\varphi^{-1}(U)$ and $\varphi^{-1}(V)$. So, we are going to replace the
considered words by $\varphi$-images whenever it is possible. First, we
introduce three reduction operations in order to transform any given
word to a uniform word (see [T]).

The first operation is called $r_1$-reduction. It reduces all factors of
the form $c^n$ to $c^2$, where $c \in \Sigma$ and $n > 2$. The result of
this operation applied to a word U is denoted by $r_1(U)$. We say that a
word U is $r_1$-reduced if $U = r_1(U)$.

Obviously, $U \sim r_1(U)$ for any $U \in \Sigma^+$. Moreover,
$r_1$-reduction preserves the relation $\pi$ as well: $(U, V) \in \pi$
implies $(r_1(U), r_1(V)) \in \pi$ or $r_1(U) = r_1(V)$ (see pQ). Thus
if we denote by $\pi_{r_1}$ and $\sim_{r_1}$ the restrictions of the
relations $\pi$ and $\sim$ respectively to the set of all $r_1$-reduced
words, then we get $\sim_{r_1} = \pi_{r_1}^+$. Therefore it is
sufficient to solve the word problem for $r_1$-reduced words only. For
this reason, in the sequel we usually consider $r_1$-reduced words.

Let us denote the class of all $r_1$-reduced words that are equivalent
to U by $[U]_{r_1}$. Obviously, all almost overlap-free words, except
for the words aaa and bbb, are $r_1$-reduced. Hence, Theorem 1 can be
reformulated as follows:

(*) for any word U, the class $[U]_{r_1}$ contains at most one almost
overlap-free word.

Actually we will prove Theorem 1 in this form.

Now let U be an $r_1$-reduced word. Then $r_A(U)$ is the word obtained
from U by performing all possible reductions of the form

$aAa \to aa$. (1.1a)

The word $r_B(U)$ is defined in a symmetric way using reductions of the
form

$bBb \to bb$. (1.1b)

Finally, let $r(U) = r_B(r_A(r_1(U)))$ for an arbitrary word
$U \in \Sigma^*$. The operation r is called complete reduction. We call
a word U completely reduced if $r(U) = U$. As it was shown in [I], the
word $r(U)$ can be obtained from $r_1(U)$ by performing all possible
reductions of the form (1.1a) and (1.1b) in any order.
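The three reductions can be sketched with regular expressions. Since A stands for a word of the form $aba(ba)^*$, a factor aAa is a word $a(ab)^k aa$ with $k \ge 1$, which is the pattern used below. This Python sketch is an illustration under our own reading of the definitions: in particular, we interpret "performing all possible reductions" as repeating the rewriting until no reducible factor remains, and all function names are ours.

```python
import re


def r1(w: str) -> str:
    """r_1-reduction: shrink every run c^n with n > 2 to c^2."""
    return re.sub(r"a{3,}|b{3,}", lambda m: m.group(0)[:2], w)


def _reduce_all(pattern: str, replacement: str, w: str) -> str:
    """Apply one rewriting rule repeatedly until it no longer applies."""
    while True:
        reduced = re.sub(pattern, replacement, w)
        if reduced == w:
            return w
        w = reduced


def r_A(w: str) -> str:
    """Reductions (1.1a): aAa -> aa, i.e. a(ab)^k aa -> aa for k >= 1."""
    return _reduce_all(r"a(?:ab)+aa", "aa", w)


def r_B(w: str) -> str:
    """Reductions (1.1b): bBb -> bb, i.e. b(ba)^k bb -> bb for k >= 1."""
    return _reduce_all(r"b(?:ba)+bb", "bb", w)


def complete_reduction(w: str) -> str:
    """r(U) = r_B(r_A(r_1(U)))."""
    return r_B(r_A(r1(w)))
```

For example, `complete_reduction("aabaabaa")` first rewrites the word to aabaa and then to aa.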

It is easy to check that, in contrast to $r_1$-reductions, some complete
reductions do not preserve the congruence $\sim$. However, as we will
see later, under certain conditions even the complete reduction
preserves $\sim$.

It turns out that completely reduced words are exactly the uniform ones:

Proposition 1.2. The following three conditions are equivalent for an
arbitrary word U:

(1) U is a factor of a $\varphi$-image;

(2) U is uniform;

(3) U is completely reduced.

The equivalence of (2) and (3) was proved in [13], while the equivalence
of (1) and (2) follows from the definitions.

1.2 Reduction of AB-whole words. Non-reducible tails. So, we can reduce
any word U to the uniform word $r(U)$. Proposition 1.4 below states that
words U and $r(U)$ are equivalent under certain conditions. This
proposition uses the important notions of A-whole and B-whole words
(see [Tj). An $r_1$-reduced word W is called A-whole if every factor X
of the form aAa occurs in W inside the factor abXba. The notion of a
B-whole word is dual to the above one. If a word is A-whole and B-whole,
it will be called AB-whole. Obviously, any completely reduced word is
AB-whole.
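The definition of an A-whole word can be tested directly. In the following Python sketch (function names are ours, and the input is assumed to be already $r_1$-reduced), every occurrence of a factor $a(ab)^k aa$ with $k \ge 1$ is located with a lookahead, and we check that it is preceded by ab and followed by ba. The B-whole test is obtained by exchanging the letters.

```python
import re

_AAA = re.compile(r"(?=(a(?:ab)+aa))")      # lookahead finds every occurrence


def negate(w: str) -> str:
    """Exchange the letters a and b."""
    return w.translate(str.maketrans("ab", "ba"))


def is_A_whole(w: str) -> bool:
    """Every factor X = aAa occurs inside a factor abXba (w assumed r_1-reduced)."""
    for m in _AAA.finditer(w):
        i, j = m.start(1), m.end(1)
        if i < 2 or w[i - 2:i] != "ab" or w[j:j + 2] != "ba":
            return False
    return True


def is_B_whole(w: str) -> bool:
    return is_A_whole(negate(w))


def is_AB_whole(w: str) -> bool:
    return is_A_whole(w) and is_B_whole(w)
```

For instance, abaabaaba is AB-whole, while aabaa is not A-whole because its factor aabaa is not surrounded by ab and ba.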
In the sequel we often use the following proposition proved in pQ. 

Proposition 1.3. If $r_1$-reduced words U and V are equivalent and U is
A-whole (B-whole), then V is A-whole (resp., B-whole) as well.

Now we are ready to formulate Proposition 1.4.

Proposition 1.4. Let U be an AB-whole word. Then $U \sim r(U)$ if and
only if U has no prefix of the form $(aba)(aba)^*(ab)^2(ab)^*aa$ and no
suffix of the form $aa(ba)^*(ba)^2(aba)^*(aba)$ up to negation.



We refer to the prefixes and suffixes mentioned in Proposition 1.4 as
non-reducible tails and distinguish four kinds of such tails according
to the following table:

                 A-tail                          B-tail
left (prefix)    $(aba)(aba)^*(ab)^2(ab)^*aa$    $(bab)(bab)^*(ba)^2(ba)^*bb$
right (suffix)   $aa(ba)^*(ba)^2(aba)^*(aba)$    $bb(ab)^*(ab)^2(bab)^*(bab)$
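The four kinds of non-reducible tails in the table are regular expressions, so they can be detected with Python's `re` module. The sketch below is an illustration only; the function name and the representation of a tail kind as a pair such as `("left", "A")` are our own conventions. The B-tails are handled by negating the word and reusing the A-tail patterns.

```python
import re

# (aba)(aba)* = (aba)+ and (ab)^2 (ab)* = (ab){2,}
_LEFT_A = re.compile(r"(?:aba)+(?:ab){2,}aa")          # must match as a prefix
_RIGHT_A = re.compile(r"aa(?:ba){2,}(?:aba)+\Z")       # must match as a suffix


def negate(w: str) -> str:
    """Exchange the letters a and b."""
    return w.translate(str.maketrans("ab", "ba"))


def non_reducible_tails(w: str) -> set:
    """Return the kinds of non-reducible tails of w as (side, type) pairs."""
    kinds = set()
    for tail_type, word in (("A", w), ("B", negate(w))):
        if _LEFT_A.match(word):
            kinds.add(("left", tail_type))
        if _RIGHT_A.search(word):
            kinds.add(("right", tail_type))
    return kinds
```

For example, abaababaa is itself a left A-tail, and its reversal aababaaba is a right A-tail.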



In order to prove Proposition 1.4 we need an auxiliary result.

Lemma 1.1. Let U and V be $r_1$-reduced words, $U \sim V$, and let U
have a non-reducible tail. Then the word V has a non-reducible tail of
the same kind.

Proof. It is sufficient to prove the statement of the lemma for a pair
of neighbours. Let $U = XY^kZ$, $V = XY^lZ$, where $X, Z \in \Sigma^*$,
$Y \in \Sigma^+$, and $\{k, l\} = \{2, 3\}$. Without loss of generality
assume that U has a left A-tail, that is, $U = (aba)^n(ab)^m aaT$, where
$n \ge 1$, $m \ge 2$, and $T \in \Sigma^*$. The words U and V have the
common prefix XYY, therefore the proof is evident if $(aba)^n(ab)^m aa$
is a prefix of the word XYY. So suppose that XYY is a proper prefix of
the word $(aba)^n(ab)^m aa$. Consider all possible cases.

First suppose that $|X| \ge |(aba)^n|$. In this case, $YY \le (ab)^m a$.
Hence Y is a letter-alternating word of even length. Since $Y^k$ is also
letter-alternating, we have $Y^k \le (ab)^m a$. Then the word V has the
prefix $(aba)^n Sa$ for some neighbour S of the word $(ab)^m a$. Thus V
has a left A-tail, as desired.

Now let $|X| < |(aba)^n|$ and $|XY| > |(aba)^n|$. Then Y contains the
factor aa obtained from the last letter of $(aba)^n$ and the first
letter of $(ab)^m a$. On the other hand, the suffix Y of XYY is
letter-alternating as a factor of the word $(ab)^m a$. We get a
contradiction. Hence, this case is impossible.

Finally, let $|XY| \le |(aba)^n|$. Then the word Y contains no factors
abab and baba. Thus, $XY^2 \le (aba)^n(aba)$, whence the word $Y^2$ also
contains no factors abab and baba; in particular,
$Y \notin \{ab, ba\}$. Since the words U and V are $r_1$-reduced, we get
$Y \notin \{a, b, aa, bb\}$. Therefore, $|Y| > 2$ and the factors abab
and baba can occur in $Y^k$ inside the factor $Y^2$ only. Hence the word
$Y^k$ contains no factors abab and baba as well, and we have
$XY^k \le (aba)^{n+1}$. So the word V has the prefix $Sb(ab)^{m-2}aa$
for some neighbour S of the word $(aba)^{n+1}$. One can easily check
that the prefix $Sb(ab)^{m-2}aa$ is a left A-tail. This completes the
proof of the lemma. □

Proof of Proposition 1.4. Let a word U be AB-whole and have no
non-reducible tails. Recall that reductions of kind (1.1a) and (1.1b)
can be applied in any order. So, to prove the forward implication it is
sufficient to find a sequence of reductions from U to $r(U)$ such that
every single reduction preserves the relation $\sim$. Indeed, by
Proposition 1.3, the word obtained by a single reduction will remain
AB-whole and, by Lemma 1.1, this word will have no non-reducible tails.

We will reduce U as follows: first perform all possible reductions of
the form $aabaa \to aa$ and $bbabb \to bb$ (in any order) and then all
other reductions (also, in any order).

Suppose that $U = XaabaaY$, and $U' = XaaY$ is obtained from U by the
reduction $aabaa \to aa$. Since U is A-whole, we have
$U = X'abaabaabaY'$, where $X = X'ab$ and $Y = baY'$. Thus we get
$U = X'(aba)^3Y'$, $U' = X'(aba)^2Y'$, whence $U \sim U'$. A single
reduction $bbabb \to bb$ is examined symmetrically.

Now suppose that U contains no factors aabaa and bbabb. Without loss of
generality assume that $U = Xa(ab)^k aaY$, where $k \ge 2$, and let U'
be obtained from U by the reduction $a(ab)^k aa \to aa$, that is,
$U' = XaaY$. Since U is an A-whole word, we can write
$U = X'aba(ab)^k aabaY'$, where $X = X'ab$ and $Y = baY'$. If
$X' = \lambda$ or $Y' = \lambda$, then U has a non-reducible left
(resp., right) tail, which is impossible by the conditions of the
proposition. Hence, $X' \ne \lambda$ and $Y' \ne \lambda$. Since the
word U contains no factors aabaa and bbabb, we have
$U = X''baba(ab)^k aababY''$. Then

$U = X''(baba)(ab)^k a(abab)Y'' \sim X''(ba)^{k+1}(ab)^k a(ab)^{k+1}Y'' =$
$X''b(ab)^k a(ab)^k a(ab)^k abY'' \sim X''b(ab)^k a(ab)^k abY'' =$
$X''(ba)^{k+1}(ab)^{k+1}Y'' \sim X''babaababY'' = U'$.

We get $U \sim U'$, as desired. So, the forward implication is proved.
The backward implication ($r(U) \sim U$ implies that U has no
non-reducible tails) trivially follows from Lemma 1.1, since $r(U)$ has
no non-reducible tails.

□

Our prime interest is in the study of equivalence classes that contain
almost overlap-free words. Since almost overlap-free words have no
non-reducible tails, all words from such equivalence classes have no
non-reducible tails by Lemma 1.1. So, if a word U is AB-whole, we can
replace U by the word $r(U)$, which is equivalent to U by
Proposition 1.4, and thus significantly simplify further analysis. The
case when a word U is not AB-whole will be considered below in
Subsection 1.4.

1.3 Uniform neighbours and quasi-neighbours. This subsection is devoted
to the study of uniform neighbours. We already know that
$\sim_{r_1} = \pi_{r_1}^+$. Let us denote the restrictions of the
relations $\sim$ and $\pi$ to the set of all uniform words by $\sim_r$
and $\pi_r$ respectively. In this subsection we prove the following
equality:

Proposition 1.5. $\sim_r = \pi_r^+$.

First, we give more definitions. We say that words U and V are
ab-neighbours and write $(U, V) \in \pi_{ab}$ if one of them has the
form $X(ab)^2Z$ and the other has the form $X(ab)^3Z$ for some
$X, Z \in \Sigma^*$. Thus we get the ab-neighbourhood relation, which
obviously preserves the uniformity. We call words U and V
quasi-neighbours if U = V or there exist two sequences
$\{U_i\}_{i=1}^n$ and $\{V_j\}_{j=1}^m$ of words such that $U_1 = U$,
$(U_i, U_{i+1}) \in \pi_{ab}$ for each $i = 1, \ldots, n-1$; $V_1 = V$,
$(V_j, V_{j+1}) \in \pi_{ab}$ for each $j = 1, \ldots, m-1$; and
$(U_n, V_m) \in \pi$.

The proof of Proposition 1.5 is based on the following lemma.

Lemma 1.2. Let AB-whole words U and V have no non-reducible tails.
If U and V are neighbours, then the words $r(U)$ and $r(V)$ are
quasi-neighbours.

Proof. Without loss of generality let $U = XYYZ$ and $V = XYYYZ$ for
some $X, Z \in \Sigma^*$ and $Y \in \Sigma^+$. First, we apply the
r-reduction to the words $XY[1]$, $Y$, and $Y[|Y|]Z$. Obviously,
$r(XY[1]) = X'Y[1]$ and $r(Y[|Y|]Z) = Y[|Y|]Z'$ for some
$X', Z' \in \Sigma^*$, and $r(Y)$ begins with $Y[1]$ and ends with
$Y[|Y|]$. Thus we obtain the neighbours $X'r(Y)r(Y)Z'$ and
$X'r(Y)r(Y)r(Y)Z'$. Clearly, these words are AB-whole and have no
non-reducible tails. So, to simplify notation we may assume that the
words $XY[1]$, $Y$, and $Y[|Y|]Z$ are already r-reduced. Suppose that at
least one of the words U and V is not r-reduced, and a word P is its
factor of the form aAa or bBb. Without loss of generality we assume that
$P = a(ab)^k aa$ for some integer $k > 0$. The word P either contains Y
as an internal factor or occurs inside the factors XY, YY, or YZ.
Consider first the cases when $Y \ll P$ or $P \le YY$.

Suppose $Y \ll P$. Then the word Y is letter-alternating. If Y has odd
length, then the word $V = XYYYZ$ contains the factor $Y[|Y|]YY[1]$ of
the form aAa or bBb. The reduction of this factor turns V into U, and
the lemma readily follows. If the length of Y is even, then the words YY
and YYY are also letter-alternating. Hence the factor P begins inside X
and ends inside Z, that is, $P = X'Y^hZ'$, where X' is a suffix of X, Z'
is a prefix of Z, and either $h = 2$ and $P \le U$ or $h = 3$ and
$P \le V$. Clearly, the words X'YYZ' and X'YYYZ' both have the form aAa.
Thus we reduce the factor X'YYZ' in U and the factor X'YYYZ' in V to get
two identical words and prove the lemma.

In the case $P \le YY$, we have $P = aY_2Y_1a$, where the words $Y_1a$
and $aY_2$ are respectively a prefix and a suffix of Y such that
$Y_2Y_1 = (ab)^k a$. Obviously, the prefix $Y_1$ of the word Y does not
overlap the suffix $Y_2$ of Y (otherwise $Y_1$ contains the factor
$aY_2[1] = aa$, which is impossible). Thus, $Y = Y_1Y'Y_2$ for some
$Y' \in \Sigma^*$. We have

$U = XY_1Y'(Y_2Y_1)Y'Y_2Z$ and $V = XY_1Y'(Y_2Y_1)Y'(Y_2Y_1)Y'Y_2Z$,

and after the reduction $P \to aa$ we get the neighbours
$XY_1Y'Y'Y_2Z$ and $XY_1Y'Y'Y'Y_2Z$.

Now suppose that the word YY is already r-reduced and Y is not an
internal factor of P. Then P occurs in both words U and V inside the
factor XY or YZ. First, assume that YZ is r-reduced. Then $P \le XY$ and
P is a unique factor of the form aAa or bBb in U and V. So,
$P = aX_1Y_1a$, where $aX_1$ is a suffix of X, $Y_1a$ is a prefix of Y,
$X_1 \in \Sigma^*$, $Y_1 \in \Sigma^+$, and $X_1Y_1 = (ab)^k a$. Let
$Y = Y_1aY'$ for some $Y' \in \Sigma^*$. Since the words U and V are
AB-whole, we have $X = X'abaX_1$ for some $X' \in \Sigma^*$. If
$k = 1$, then

$U = X'aba(aba)aY'YZ$ and $V = X'aba(aba)aY'YYZ$.

After the reduction $P \to aa$ we obtain the neighbours

$X'abaaY'YZ = X'X_1Y_1aY'YZ = X'X_1YYZ$

and

$X'abaaY'YYZ = X'X_1Y_1aY'YYZ = X'X_1YYYZ$.

In the case $k \ge 2$, we have $X' \ne \lambda$ (if $X' = \lambda$, then
both words U and V have the non-reducible tail $(aba)(ab)^k aa$, in
contradiction with the lemma's condition). Since X is r-reduced, the
last letter of X' cannot be equal to a, therefore $X = X''babaX_1$.
After the reduction $P \to aa$ we obtain the quasi-neighbours

$U' = X''babaaY'YZ$ and $V' = X''babaaY'YYZ$.

Indeed, if we put

$U_i = X''ba(ba)^i aY'YZ$ and $V_i = X''ba(ba)^i aY'YYZ$

for $i = 1, \ldots, k$, we get $U_1 = U'$, $V_1 = V'$,
$(U_i, U_{i+1}) \in \pi_{ab}$, $(V_i, V_{i+1}) \in \pi_{ab}$ for each
$i = 1, \ldots, k-1$, and the words

$U_k = X''ba(ba)^k aY'YZ = X''bX_1Y_1aY'YZ = X''bX_1YYZ$

and

$V_k = X''ba(ba)^k aY'YYZ = X''bX_1Y_1aY'YYZ = X''bX_1YYYZ$

are neighbours. So, since $U' = r(U)$ and $V' = r(V)$, the words $r(U)$
and $r(V)$ are quasi-neighbours, as desired.

The case $P \le YZ$ is considered in the same way. Finally, if both
words XY and YZ are not r-reduced, then there exist two words $P_1$ and
$P_2$ of the form aAa or bBb such that $P_1 \le XY$ and $P_2 \le YZ$. In
this case, after the reductions of $P_1$ and $P_2$ we obtain the words
$r(U)$ and $r(V)$, which turn out to be quasi-neighbours. Indeed, we can
construct two sequences of words, instead of the one in the previous
case, in the same way, increasing the power of the factor ab or ba in
the prefix $r(XY)$ and in the suffix $r(YZ)$ of the words $r(U)$ and
$r(V)$. These sequences obviously satisfy all conditions from the
definition of quasi-neighbours. This completes the proof. □

We should mention that a weaker version of Lemma 1.2 for the case when
U and V are equivalent to some $\varphi$-images was proved in [TJ.

Proof of Proposition 1.5. A sequence $\{R_i\}_{i=0}^n$ of words is
called a linking (U, V)-sequence if $R_0 = U$, $R_n = V$, and
$(R_{i-1}, R_i) \in \pi$ for each $i = 1, \ldots, n$. If all the words
$R_i$ are $r_1$-reduced (r-reduced), we call such a sequence an
$r_1$-linking (resp., an r-linking) (U, V)-sequence. In these terms,
Proposition 1.5 means that two uniform words are equivalent if and only
if there exists an r-linking (U, V)-sequence.

If there exists an r-linking (U, V)-sequence, then $U \sim V$ and both
words U and V are uniform. Conversely, let U and V be uniform words and
$U \sim V$. Then there exists an $r_1$-linking (U, V)-sequence
$\{W_k\}_{k=0}^n$. Since the words U and V are uniform, all words $W_k$
are AB-whole and have no non-reducible tails. Consider the sequence
$\{W'_k = r(W_k)\}_{k=0}^n$. We have $W'_0 = U$, $W'_n = V$, and, by
Lemma 1.2, the words $W'_{k-1}$ and $W'_k$ are uniform quasi-neighbours
for each $k = 1, \ldots, n$. So, the sequence $\{W'_k\}_{k=0}^n$ is an
r-linking (U, V)-sequence, as desired. □

1.4 Non-uniform almost overlap-free words. Non-uniform tails.

All results mentioned in this subsection were proved in [13]. In the
sequel, we often make use of the negation operation
$W \mapsto \overline{W}$, which is the unique non-trivial automorphism
of $\Sigma^*$. For example, $\varphi(\overline{W}) = \overline{\varphi(W)}$.
We also write $\overline{L} = \{\overline{W} \mid W \in L\}$ for any
$L \subseteq \Sigma^*$.

Suppose that a word U is not AB-whole and an almost overlap-free word V
is equivalent to U. Then V is not AB-whole as well by Proposition 1.3;
in particular, V is non-uniform. Define

$\mathcal{A}_1 = \{aabaa, aabaab, baabaa, baabaab, aabaabb,
bbaabaa, aabaaba, abaabaa, aabaabbaabaa\}$,

and let $\mathcal{S}_1 = \mathcal{A}_1 \cup \overline{\mathcal{A}_1}$.
The non-uniform almost overlap-free words can be characterized as
follows.

Proposition 1.6. Let V be a non-uniform $r_1$-reduced almost
overlap-free word. Then at least one of the following conditions holds:

1) $V \in \mathcal{S}_1$;

2) Up to negation, V has the prefix aabaabba or the suffix abbaabaa.
Since all words from $\mathcal{S}_1$ are almost overlap-free, not
uniform, and not AB-whole, we immediately get the following corollary of
Proposition 1.6.

Corollary 1.1. Any almost overlap-free word is AB-whole if and only if
it is uniform.

The $r_1$-reduced words that are equivalent to one of the prefixes or
suffixes mentioned in Proposition 1.6, 2) will be called non-uniform
tails. There are four kinds of such tails:

          left                  right
A-tail    $(aab)^*(aab)^2ba$    $ab(baa)^2(baa)^*$
B-tail    $(bba)^*(bba)^2ab$    $ba(abb)^2(abb)^*$

The tail reduction operation $r_T$, defined in [13], reduces a left
non-uniform tail of any word to the last 7 symbols, and a right
non-uniform tail to the first 7 symbols:

                  left       right
reduced A-tail    abaabba    abbaaba
reduced B-tail    babbaab    baabbab

Note that a tail of an almost overlap-free word, if any, has exactly 8
symbols. Hence the operation $r_T$ deletes exactly one letter from such
a tail.
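The two tables above suggest a straightforward implementation of tail reduction with regular expressions. The following Python sketch is our own illustration (the function names are not from the paper, and the operation itself is defined in [13], which we do not reproduce): it assumes that the tails are exactly the prefixes and suffixes matching the patterns of the first table, and it replaces each of them by the corresponding 7-letter word of the second table.

```python
import re

# (aab)*(aab)^2 = (aab){2,}, and similarly for the other three patterns.
_LEFT = (
    (re.compile(r"(?:aab){2,}ba"), "abaabba"),      # left A-tail
    (re.compile(r"(?:bba){2,}ab"), "babbaab"),      # left B-tail
)
_RIGHT = (
    (re.compile(r"ab(?:baa){2,}\Z"), "abbaaba"),    # right A-tail
    (re.compile(r"ba(?:abb){2,}\Z"), "baabbab"),    # right B-tail
)


def reduce_left_tail(w: str) -> str:
    """Replace a left non-uniform tail by its last 7 symbols."""
    for pattern, reduced in _LEFT:
        m = pattern.match(w)
        if m:
            return reduced + w[m.end():]
    return w


def reduce_right_tail(w: str) -> str:
    """Replace a right non-uniform tail by its first 7 symbols."""
    for pattern, reduced in _RIGHT:
        m = pattern.search(w)
        if m:
            return w[:m.start()] + reduced
    return w


def tail_reduction(w: str) -> str:
    """Reduce the left tail first, then the right one."""
    return reduce_right_tail(reduce_left_tail(w))
```

On the 8-letter tail aabaabba the sketch deletes exactly one letter and returns abaabba, in agreement with the note above.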

In addition to the function $r_T$, we use the functions $r_T^l$ and
$r_T^r$, which reduce a left (resp., right) non-uniform tail of any
given word to the last (resp., first) 7 symbols. If a word W has both
left and right tails, then these tails have at most 6 symbols in common
(up to negation, for the word $W = (aab)^*aabaabbabb(abb)^*$). Hence
the operation $r_T^l$ preserves the right tail of W and $r_T^r$
preserves the left tail of W. Thus we have
$r_T(W) = r_T^r(r_T^l(W)) = r_T^l(r_T^r(W))$ for any word W.

We make one easy observation.

Observation 1.2. Let U and V be $r_1$-reduced words and
$r_T(U) \sim r_T(V)$. Then $U \sim V$.

Note that the converse is not true in general. Surprisingly, under
certain conditions, the operation $r_T$ preserves the congruence $\sim$.

Proposition 1.7. Suppose that $U \sim V$ for $r_1$-reduced words U and
V. Then

1) U and V have non-uniform tails of the same kind, if any;

2) If $U, V \notin [aabaabbaabaa]_{r_1} \cup [bbabbaabbabb]_{r_1}$ and
at least one of the words $r_T(U)$ and $r_T(V)$ is AB-whole, then
$r_T(U) \sim r_T(V)$.

Combining Propositions 1.7, 1.6, 1.3 and the definition of $r_T$, we get
the following corollary.

Corollary 1.2. Let $U \sim V$ for an $r_1$-reduced word U and an
$r_1$-reduced almost overlap-free word V. If $V \notin \mathcal{S}_1$,
then $r_T(U) \sim r_T(V)$ and the words $r_T(U)$, $r_T(V)$ are AB-whole.

So, it remains to calculate the equivalence classes of all words from
$\mathcal{S}_1$. The following lemma gives the answer.
Lemma 1.3. 1) $[aabaa]_{r_1} = aabaa$;

2) $[aabaab]_{r_1} = (aab)^2(aab)^*$;

3) $[baabaa]_{r_1} = (baa)^*(baa)^2$;

4) $[baabaab]_{r_1} = (baa)^2(baa)^*b$;

5) $[aabaabb]_{r_1} = (aab)^2(aab)^*b$;

6) $[bbaabaa]_{r_1} = b(baa)^*(baa)^2$;

7) $[aabaaba]_{r_1} = (aab)^2(aab)^*a$;

8) $[abaabaa]_{r_1} = a(baa)^*(baa)^2$;

9) $[aabaabbaabaa]_{r_1} = (aab)^2(aab)^*(b(aab)^*aab)^*(baa)^*(baa)^2$.

The classes of the words from $\overline{\mathcal{A}_1}$ are negations
of the classes 1)-9).

Note that for any word $V \in \mathcal{S}_1$ the equivalence class
$[V]_{r_1}$ is a regular language. Thus the word problem is decidable
for any pair (U, V): one can build a finite automaton recognizing
$[V]_{r_1}$ and try to accept U by this automaton.
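Since the classes listed in Lemma 1.3 are given by regular expressions, the membership test mentioned above can be sketched with Python's `re` module in place of an explicitly constructed automaton. The sketch is an illustration under our own conventions: the names `CLASS_PATTERNS` and `in_class` are hypothetical, the patterns are transcribed from items 1)-9) of the lemma, and the input word is assumed to be $r_1$-reduced.

```python
import re

# Patterns for [V]_{r_1}, V in A_1, transcribed from Lemma 1.3;
# X^2 X^* is written as X{2,} and b(aab)^* aab as b(aab)+.
CLASS_PATTERNS = {
    "aabaa": r"aabaa",
    "aabaab": r"(?:aab){2,}",
    "baabaa": r"(?:baa){2,}",
    "baabaab": r"(?:baa){2,}b",
    "aabaabb": r"(?:aab){2,}b",
    "bbaabaa": r"b(?:baa){2,}",
    "aabaaba": r"(?:aab){2,}a",
    "abaabaa": r"a(?:baa){2,}",
    "aabaabbaabaa": r"(?:aab){2,}(?:b(?:aab)+)*(?:baa){2,}",
}


def negate(w: str) -> str:
    """Exchange the letters a and b."""
    return w.translate(str.maketrans("ab", "ba"))


def in_class(u: str, v: str) -> bool:
    """Decide whether the r_1-reduced word u lies in [v]_{r_1}, v in S_1."""
    if v not in CLASS_PATTERNS:
        u, v = negate(u), negate(v)          # classes of negated words
    if v not in CLASS_PATTERNS:
        raise ValueError("v must belong to the set S_1")
    return re.fullmatch(CLASS_PATTERNS[v], u) is not None
```

For example, `in_class("aabaabaab", "aabaab")` holds, while `in_class("aabaa", "aabaab")` does not.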

1.5 From uniform words to $\varphi$-images: functions $\xi$ and $\eta$.
Combining the functions $r_T$ and r, we can transform a given word U to
a uniform word. Here we introduce the functions $\xi$ and $\eta$, which
transform any uniform word to a $\varphi$-image. By Proposition 1.2, any
uniform word U is a factor of a $\varphi$-image. That is,

$U = cQ_1 \ldots Q_kd$, where $Q_1, \ldots, Q_k \in \{ab, ba\}$, $c, d \in \{a, b, \lambda\}$.

This representation is unique if U is not letter-alternating. For
letter-alternating words we additionally require $c = \lambda$ to get a
unique representation as well. Now put

$\eta(U) = Q_1 \ldots Q_k$, $\xi(U) = \bar{c}U\bar{d}$.

Obviously, $\eta(U)$ is a $\varphi$-image of maximum length contained in
U, while $\xi(U)$ is a $\varphi$-image of minimum length containing U.
We denote $h_\eta(U) = c$, $t_\eta(U) = d$, $h_\xi(U) = \bar{c}$, and
$t_\xi(U) = \bar{d}$. First, we establish some basic properties of $\xi$
and $\eta$.
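The representation $U = cQ_1 \ldots Q_kd$ can be computed from the parity of the positions where factors aa or bb start. The following Python sketch is an illustration with names of our own choosing; it assumes, as in the definition above, that $\xi(U)$ is obtained by prepending the negation of c and appending the negation of d, and it rejects non-uniform input.

```python
def negate(w: str) -> str:
    """Exchange the letters a and b (the empty word is left unchanged)."""
    return w.translate(str.maketrans("ab", "ba"))


def decompose(u: str):
    """Return (c, blocks, d) with u == c + blocks + d and blocks in {ab, ba}*."""
    parities = {i % 2 for i in range(len(u) - 1) if u[i] == u[i + 1]}
    if len(parities) > 1:
        raise ValueError("the word is not uniform")
    # With 0-based indices, a factor aa/bb at an even index must straddle
    # the border between c and Q_1, so c is non-empty exactly in that case.
    # Letter-alternating words get c = "" by convention.
    c = u[0] if parities == {0} else ""
    rest = u[len(c):]
    d = rest[-1] if len(rest) % 2 else ""
    return c, rest[:len(rest) - len(d)], d


def eta(u: str) -> str:
    """Longest phi-image Q_1...Q_k contained in u."""
    return decompose(u)[1]


def xi(u: str) -> str:
    """Shortest phi-image containing u."""
    c, _, d = decompose(u)
    return negate(c) + u + negate(d)
```

For example, for U = aab we get c = a and d empty, so `eta("aab")` is ab and `xi("aab")` is baab, the image of ba under the Thue-Morse morphism.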

Lemma 1.4. 1) Let $eUf \in \varphi(\Sigma^*)$, where
$e, f \in \{a, b, \lambda\}$ and U is not letter-alternating. Then
$\xi(U) = eUf$, $e = h_\xi(U)$, and $f = t_\xi(U)$.

2) Let $U = cU'd$, where $c, d \in \{a, b, \lambda\}$ and
$U' \in \varphi(\Sigma^*)$. Then U is uniform. Moreover, if U is not
letter-alternating, then $\eta(U) = U'$, $h_\eta(U) = c$, and
$t_\eta(U) = d$;

3) Let $U = cU'd$, where the word U' is uniform and
$c, d \in \{a, b, \lambda\}$ are such that $c \ne U'[1]$,
$d \ne U'[|U'|]$. Then the word U is uniform as well.

Proof. To prove 2), note that $U = cU'd$ and $U' \in \varphi(\Sigma^*)$
imply $U \le \bar{c}cU'd\bar{d} \in \varphi(\Sigma^*)$. Hence the word U
is uniform. If U is not letter-alternating, then the representation
cU'd is unique, so we are done. Statement 3 of the lemma is evident.
Indeed, from $c \ne U'[1]$ and $d \ne U'[|U'|]$ it follows that all
factors aa or bb occur in U inside the factor U' only. Adding the symbol
c to the beginning of U', we change (if $c \ne \lambda$) or save (if
$c = \lambda$) the parity of all positions in the word U'. Therefore,
since U' is uniform, the word U is uniform as well. Finally, one can see
that if $eUf = \varphi(X)$ for some $X \in \Sigma^*$, then
$U = \bar{e}\varphi(X')\bar{f}$, where $X' \le X$. In view of the
uniqueness of such a representation, we conclude that
$\eta(U) = \varphi(X')$, $h_\eta(U) = \bar{e}$, and
$t_\eta(U) = \bar{f}$. From the definitions of $h_\xi$ and $t_\xi$ it
immediately follows that $h_\xi(U) = e$ and $t_\xi(U) = f$. The proof is
complete. □
The following observation describes the equivalence classes of all
letter-alternating words.

Observation 1.3. The equivalence classes containing letter-alternating
words are: $[a] = a$, $[ab] = ab$, $[aba] = aba$,
$[abab] = (ab)^2(ab)^*$, $[ababa] = (ab)^2(ab)^*a$, and their negations.

Now we prove that the functions $h_\xi$, $t_\xi$, $h_\eta$, and
$t_\eta$ are invariant under the congruence $\sim$.

Proposition 1.8. Let U, V be uniform words and $U \sim V$. Then

1) $h_\xi(U) = h_\xi(V)$, $t_\xi(U) = t_\xi(V)$;

2) $h_\eta(U) = h_\eta(V)$, $t_\eta(U) = t_\eta(V)$.

Proof. If U or V is letter-alternating, then the proposition readily
follows from Observation 1.3. Assume that they are not
letter-alternating.

Let us prove the first statement. By definition of congruence,
$U \sim V$ implies $h_\xi(U)Ut_\xi(U) \sim h_\xi(U)Vt_\xi(U)$. From
$U \sim V$ and the definition of $\xi$ it follows that
$h_\xi(U) \ne U[1] = V[1]$ and $t_\xi(U) \ne U[|U|] = V[|V|]$. Hence,
by Lemma 1.4, 3), the word $h_\xi(U)Vt_\xi(U)$ is uniform. So we have
$\xi(U) \sim h_\xi(U)Vt_\xi(U)$, where $\xi(U) \in \varphi(\Sigma^*)$
and the word $h_\xi(U)Vt_\xi(U)$ is uniform. According to
Proposition 1.1, we get $h_\xi(U)Vt_\xi(U) \in \varphi(\Sigma^*)$ as
well. In view of Lemma 1.4, 1), we have $h_\xi(V) = h_\xi(U)$ and
$t_\xi(V) = t_\xi(U)$, as desired.

Statement 2 trivially follows from statement 1 and the definitions of
$h_\eta$, $t_\eta$, $h_\xi$, and $t_\xi$. □

According to Proposition 1.8, the function $\xi$ preserves the
congruence $\sim$, whereas the function $\eta$ does not preserve $\sim$
in the general case. However, as we will show in Sect. El under certain
conditions $\eta$ preserves $\sim$ as well. The function $\xi$ is used
in the proof of Theorem 1, while the function $\eta$ plays a crucial
role in the construction of Algorithm EqAOF.

Finally, we note that the functions $\varphi^{-1}(\xi(V))$ and
$\varphi^{-1}(\eta(V))$ preserve the property of a word to be almost
overlap-free. For $\xi$ this was proved in [14]; for $\eta$ even a
stronger assertion holds.

Observation 1.4. If V is almost overlap-free, then the word V -1 (7?(V)) 
is overlap-free. 

Proof. An almost overlap-free word that is not overlap-free has the form 
cYcYc for some cGE and Y G £*, hence, it has odd length. Since the length 
of 7)(V) is even, r](V) is a proper factor of V and, hence, it is overlap-free. 
The function tp~ l preserves overlap-freeness, so the observation follows. □ 
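
The notion of an overlap used in this proof is easy to test on examples. The sketch below is only an illustration. It uses the standard definition that an overlap is a factor of the form cYcYc, that is, a factor of length 2p + 1 with period p = |cY|; such a factor has odd length 2|Y| + 3, which is the parity fact used above. The function name `has_overlap` is hypothetical.

```python
def has_overlap(word):
    """Return True if `word` contains a factor of the form cYcYc."""
    n = len(word)
    for p in range(1, n // 2 + 1):        # p = |cY| is the period of the overlap
        run = 0                           # current run of matches word[i] == word[i + p]
        for i in range(n - p):
            run = run + 1 if word[i] == word[i + p] else 0
            if run > p:                   # p + 1 consecutive matches give length 2p + 1
                return True
    return False
```

With this function, `aaa` and `ababa` are reported as containing overlaps, while `abba` and `aabaabbaabaa` are overlap-free.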

2 Proof of Theorem 1

We prove Theorem 1 in the form (*) and use the minimal counterexample
method. Suppose that a pair (U, V) provides a counterexample, i.e., U and
V are two nonequal r_1-reduced almost overlap-free words such that U ~ V,
and l = min{|U|, |V|} takes the minimum value among all such pairs. We
aim to get a contradiction by obtaining a "shorter" counterexample.

Clearly, l > 2. By Lemma 1.3, any equivalence class [W]_{r_1} for W ∈ S_1
contains exactly one almost overlap-free word (namely, the word W itself).
Therefore U, V ∉ S_1. Similarly, by Observation 1.3, the words U and V are
not letter-alternating. It follows from Corollaries 1.1 and 1.2 that r_T(U) ~
r_T(V) and r_T(U), r_T(V) are uniform almost overlap-free words. Hence, by
Proposition 1.8, the words U_1 = φ^{-1}(ξ(r_T(U))) and V_1 = φ^{-1}(ξ(r_T(V))) are
equivalent. Obviously, both words U_1 and V_1 are almost overlap-free and
U_1 ≠ V_1.

Since the words U and V are not letter-alternating, the words ξ(r_T(U))
and ξ(r_T(V)) are not letter-alternating as well. In particular, both words
ξ(r_T(U)) and ξ(r_T(V)) are not equal to ababab or bababa. Therefore U_1, V_1 ∉
{aaa, bbb}.

Finally, we have l_1 = min{|U_1|, |V_1|} ≤ (l + 2)/2 < l whenever l > 2.
So the pair (U_1, V_1) provides a shorter counterexample. This completes the
proof of Theorem 1.

3 Procedure Ancestor

In this section we begin the construction of Algorithm EqAOF, which
returns the almost overlap-free word that is equivalent to a given word or
reports that no such almost overlap-free word exists. Algorithm EqAOF
includes two main procedures. Let us define the first of them.

Procedure Ancestor.

Input. A word U ∈ Σ^+.

Output. A word Anc(U) ∈ Σ^+, an integer k, and length-k arrays L, R, h, t of letters.

Step 0. Let k := 0.

Step 1. Let U := r_1(U); k := k + 1; L[k] := R[k] := h[k] := t[k] := λ.

Step 2. If |U| ≤ 2 or U ∈ [aabaabbaabaa] ∪ [bbabbaabbabb], then Anc := U;
stop.

Step 3. If U has a non-uniform left (right) tail, set L[k] := U[1] (resp.,
R[k] := U[|U|]); let U' := r_T(U).

Step 4. If U' is not AB-whole or U' has a non-reducible tail, then Anc := U;
stop.

Step 5. Let U' := r(U').

Step 6. Let h[k] := h_η(U'); t[k] := t_η(U'); U := φ^{-1}(η(U')); goto Step 1.

End.

Starting with U_1 = r_1(U), procedure Ancestor constructs the sequence of
words U_1, U_2, ... by the rule

U_{k+1} = r_1(φ^{-1}(η(r(r_T(U_k)))))

until one of the stop conditions is fulfilled. We call the sequence {U_k} the
primary U-series. Since |U_k| ≤ |U_{k-1}|/2 for any k ≥ 2, the primary U-series
is finite; its length (that is, the number of words in it) is denoted by ℓ(U).
We say that the output word Anc(U) = U_{ℓ(U)} is the ancestor of U, and the
arrays L = L_U, R = R_U, h = h_U, and t = t_U returned by procedure Ancestor
are associated with U. We omit the index U if it is clear from context.

The next two lemmas establish the basic properties of primary series.
First, we examine primary series of almost overlap-free words.

Lemma 3.1. Let V be an almost overlap-free word and {V_k}_{k=1}^{ℓ(V)} be the
primary V-series. Then

1) Anc(V) ∈ S_1 ∪ {a, b, ab, ba, aa, bb};

2) V_k is overlap-free for each k = 2, ..., ℓ(V);

3) V_k = L[k] r(r_T(V_k)) R[k] for each k = 1, ..., ℓ(V)-1.

Proof. Instead of 1)-3) we prove the following statement:

if for some k ≥ 1 the word V_k is almost overlap-free, |V_k| > 2, and
V_k ∉ S_1, then V_k = L[k] r(r_T(V_k)) R[k], k < ℓ(V), and the word V_{k+1} is
overlap-free.

Consider the k-th iteration of procedure Ancestor, i.e., the processing of
the word V_k. By the conditions above, procedure Ancestor cannot stop on
Step 2. By Proposition 1.6, r_T(V_k) is a uniform almost overlap-free word.
Hence procedure Ancestor cannot stop on Step 4 as well. Thus procedure
Ancestor does not stop on the k-th iteration, so we have k < ℓ(V), and the
word V_{k+1} = r_1(φ^{-1}(η(r_T(V_k)))) is overlap-free in view of Observation 1.4.
From the uniformity of r_T(V_k) it follows that r(r_T(V_k)) = r_T(V_k). Since the
function r_T deletes at most one letter from the beginning and at most one
letter from the end of any almost overlap-free word (see the remark after the
definition of r_T), we get V_k = L[k] r(r_T(V_k)) R[k], as desired. □

Now we apply procedure Ancestor to a pair (U, V) of equivalent words.

Lemma 3.2. Suppose that U ~ V, m = min{ℓ(U), ℓ(V)}, {U_k}_{k=1}^{ℓ(U)} and
{V_k}_{k=1}^{ℓ(V)} are the primary U- and V-series respectively, and the following
condition holds:

(**) r(r_T(U_k)) ~ r(r_T(V_k)) ⟹ η(r(r_T(U_k))) ~ η(r(r_T(V_k))) for all k < m.

Then

1) ℓ(U) = ℓ(V);

2) U_k ~ V_k for each k = 1, ..., m; in particular, Anc(U) ~ Anc(V);

3) L_U = L_V, R_U = R_V, h_U = h_V, and t_U = t_V.

Proof. Instead of 1)-3) we prove the following statement:

if U_k ~ V_k for some k ≤ m, then L_U[k] = L_V[k], R_U[k] = R_V[k],
h_U[k] = h_V[k], t_U[k] = t_V[k], and either k = m = ℓ(U) = ℓ(V) or
k < m and U_{k+1} ~ V_{k+1}.

First, suppose that k = m = ℓ(U). If procedure Ancestor stops processing
the word U_k on Step 2, then the processing of the word V_k should stop on
Step 2 as well, because U_k ~ V_k and both words are r_1-reduced. In this case,
we have ℓ(U) = ℓ(V), and the k-th elements of all arrays associated with U
and V are equal to λ.

Now suppose that procedure Ancestor stops processing the word U_k on
Step 4. Assume additionally that the word V_k is AB-whole and has no
non-reducible tails. Then the word r_T(V_k) is AB-whole as well and we have
r_T(U_k) ~ r_T(V_k) by Proposition 1.7, 2). At the same time, from Lemma 1.1
and Proposition 1.3 it follows that the word r_T(U_k) is AB-whole and has no
non-reducible tails as well as r_T(V_k). So procedure Ancestor cannot stop
processing the word U_k on Step 4, a contradiction. Hence either the word V_k is
not AB-whole or it has a non-reducible tail. In both cases procedure Ancestor
stops processing the word V_k on Step 4, and we get k = ℓ(U) = ℓ(V). Also,
L_U[k] = L_V[k], R_U[k] = R_V[k] by Proposition 1.7, 1) and h_U[k] = t_U[k] =
h_V[k] = t_V[k] = λ.

The case k = m = ℓ(V) is symmetric to the above one. Finally, suppose
that k < m. We have r(r_T(U_k)) ~ r(r_T(V_k)) by Propositions 1.4 and 1.7.
From condition (**) it now follows that η(r(r_T(U_k))) ~ η(r(r_T(V_k))), whence

U_{k+1} = r_1(φ^{-1}(η(r(r_T(U_k))))) ~ r_1(φ^{-1}(η(r(r_T(V_k))))) = V_{k+1}.

In addition, we have L_U[k] = L_V[k], R_U[k] = R_V[k] by Proposition 1.7,
1) and h_U[k] = h_V[k], t_U[k] = t_V[k] by Proposition 1.8. This completes the
proof of the statement. □

As we said above, under certain conditions, the function η preserves the
congruence ~. In the sequel, we call a pair (P, Q) of equivalent uniform
words good if η(P) ~ η(Q), and bad otherwise. So, condition (**) guarantees
that all pairs (r(r_T(U_k)), r(r_T(V_k))) are good. We set the study of bad pairs
aside for Sect. 5.

4 Normal series

The notion in the headline plays a crucial role in the second main
procedure of Algorithm EqAOF (this procedure will be presented in Sect. 6). This
notion is defined as follows.

Take a word U and a cube-free word W ∈ [Anc(U)]. The normal U_W-series
or the W-normal series of U is the sequence of words Û_{ℓ(U)}, Û_{ℓ(U)-1}, ..., Û_1
defined by

Û_{ℓ(U)} = W,   Û_k = L_U[k] h_U[k] φ(Û_{k+1}) t_U[k] R_U[k],   k = ℓ(U)-1, ..., 1.

The word Û_1 is called the W-normal (or simply a normal) form of the word
U and denoted by N_W(U). Clearly, any word U has at least one normal
series, since the class [Anc(U)] contains cube-free words. However, the class
[Anc(U)] can contain several cube-free words, therefore U can have several
normal series. If W is almost overlap-free, then the normal U_W-series is the
main normal U-series, the W-normal form of U is called the main normal
form of U and denoted by N(U). By Theorem 1, each word U has at most
one main normal series. A normal series or main normal series is called direct
if W = Anc(U). In this case, the normal form N_W(U) is called direct as well
and denoted by N_D(U).
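
The defining recurrence of a normal series can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the paper's implementation: φ is taken to be the Thue-Morse morphism a → ab, b → ba (it is defined outside this section), the empty word λ is represented by the empty string, and the arrays L, R, h, t are assumed to be given (for example, as the output of procedure Ancestor). The names `phi` and `normal_series` are hypothetical.

```python
def phi(word):
    """Thue-Morse morphism (assumed here): a -> ab, b -> ba."""
    images = {"a": "ab", "b": "ba"}
    return "".join(images[letter] for letter in word)


def normal_series(w, L, R, h, t):
    """Return the list [U_1, ..., U_l] of the W-normal series.

    L, R, h, t are 0-based lists of length l whose entries are letters or ""
    (the empty word); entry k - 1 plays the role of L[k], R[k], h[k], t[k].
    The last entries are not used, exactly as in the defining formula.
    """
    length = len(L)
    series = [w]                              # the last word of the series is W
    for k in range(length - 1, 0, -1):        # k = l - 1, ..., 1
        nxt = series[0]                       # the word with index k + 1
        series.insert(0, L[k - 1] + h[k - 1] + phi(nxt) + t[k - 1] + R[k - 1])
    return series
```

With the hypothetical data W = ab, ℓ(U) = 2, h[1] = a and all other array entries empty, `normal_series("ab", ["", ""], ["", ""], ["a", ""], ["", ""])` yields the series aabba, ab, so the corresponding normal form is aabba.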

First, we prove the main property of normal forms.

Lemma 4.1. Suppose that W ~ Anc(U) for a word U and a cube-free
word W. Let {U_k}_{k=1}^{ℓ(U)} and {Û_k}_{k=1}^{ℓ(U)} be the primary U-series and the normal
U_W-series respectively. Then U_k ~ Û_k for each k = 1, ..., ℓ(U); in particular,
N_W(U) ~ U.

Proof. For k = ℓ(U), we have Û_{ℓ(U)} = W and U_{ℓ(U)} = Anc(U) from the
definitions. Now suppose that U_{k+1} ~ Û_{k+1} for some k < ℓ(U). If we show
that U_k ~ Û_k, the required statement will follow by induction. We have

Û_{k+1} ~ U_{k+1} = r_1(φ^{-1}(η(r(r_T(U_k))))),

whence we get Û_{k+1} ~ φ^{-1}(η(r(r_T(U_k)))) and h[k] φ(Û_{k+1}) t[k] ~ r(r_T(U_k)).
Since procedure Ancestor does not stop on the k-th iteration, the word r_T(U_k) is
AB-whole and has no non-reducible tails. From Proposition 1.4 it now follows
that r(r_T(U_k)) ~ r_T(U_k). Hence, h[k] φ(Û_{k+1}) t[k] ~ r_T(U_k). Note that if U_k
has non-uniform tails, then the word L[k] r_T(U_k) R[k] has non-uniform tails of
the same kind as the word U_k. Since all non-uniform tails of the same kind
are equivalent, we conclude that L[k] r_T(U_k) R[k] ~ U_k. Thus, we have

Û_k = L[k] h[k] φ(Û_{k+1}) t[k] R[k] ~ L[k] r_T(U_k) R[k] ~ U_k,

as desired. □

So, normal series are "inverse" to primary series in a sense. Namely,
by Lemma 4.1, if a cube-free word W is equivalent to Anc(U), then the
W-normal series of U allows one to restore the primary series {U_k}_{k=1}^{ℓ(U)} up to
equivalent words. Actually, the normal U_W-series consists of the words Û_k,
which have a simpler structure than the words U_k in the general case. The
following lemma describes the structure of the words Û_k.

Lemma 4.2. Suppose that Anc(U) ~ W for a word U and a cube-free
word W. Let {Û_k}_{k=1}^{ℓ(U)} be the W-normal series of U. Then

1) Û_k is an r_1-reduced word for each k = 1, ..., ℓ(U);

2) r_T(Û_k) is uniform for each k = 1, ..., ℓ(U)-1;

3) r_T(Û_k) = h[k] φ(Û_{k+1}) t[k] for each k = 1, ..., ℓ(U)-1;

4) h(r_T(Û_k)) = h[k], t(r_T(Û_k)) = t[k], and η(r_T(Û_k)) = φ(Û_{k+1}) for each
k = 1, ..., ℓ(U)-1.

Proof. For k = ℓ(U), there is nothing to prove. Suppose k < ℓ(U). Let
us denote the word h[k] φ(Û_{k+1}) t[k] by U'_k, and let {U_k}_{k=1}^{ℓ(U)} be the primary
U-series. The word U'_k is uniform by Lemma 1.4, 2). Since Û_{k+1} ~ U_{k+1} ~
φ^{-1}(η(r(r_T(U_k)))) by Lemma 4.1, we get U'_k ~ r(r_T(U_k)). From
Proposition 1.8 it now follows that h(U'_k) = h[k], t(U'_k) = t[k], and η(U'_k) = φ(Û_{k+1}).

It remains to prove that Û_k is an r_1-reduced word and r_T(Û_k) = U'_k. We
have Û_k = L[k] U'_k R[k]. First, we apply the function r_T^l to the word L[k] U'_k.
Clearly, if L[k] = λ, then r_T^l(L[k] U'_k) = U'_k. Now suppose that L[k] ≠ λ.
Without loss of generality we assume that L[k] = a. This means that U_k
has the non-uniform left tail (aab)^n (aab)^2 ba for some n ≥ 0. Hence the word
r(r_T(U_k)) has the prefix abaabba. Therefore the word U'_k, which is equivalent
to r(r_T(U_k)), begins with abaabba as well. Since the word L[k] U'_k does not
begin with aaa, it is r_1-reduced. So we obtain r_T^l(L[k] U'_k) = U'_k.

In the symmetric way one can prove that U'_k R[k] is an r_1-reduced word
and r_T^r(U'_k R[k]) = U'_k. So, the word Û_k = L[k] U'_k R[k] is r_1-reduced and we
have

r_T(Û_k) = r_T^r(r_T^l(L[k] U'_k R[k])) = r_T^r(U'_k R[k]) = U'_k.

This completes the proof. □

Next we investigate primary series and normal series for normal forms.

Lemma 4.3. Let Anc(U) ~ W for a word U and a cube-free word W.
Then both the primary N_W(U)-series and the W-normal series of N_W(U)
coincide with the normal U_W-series.

Proof. Let us denote P = N_W(U), m = ℓ(U), l = ℓ(P), and let {U_k}_{k=1}^m,
{Û_k}_{k=1}^m, and {P_k}_{k=1}^l be the primary series of U, the normal U_W-series, and
the primary series of P respectively. First, we prove that the primary P-series
coincides with the normal U_W-series.

By Lemma 4.2, the word P is r_1-reduced. Hence, P_1 = P = N_W(U) = Û_1.
Now suppose that P_k = Û_k for some k < m. In view of U_k ~ Û_k and
k < ℓ(U), we conclude that |P_k| > 2 and P_k ∉ [aabaabbaabaa] ∪ [bbabbaabbabb].
In addition, the word r_T(P_k) is uniform by Lemma 4.2. Hence procedure
Ancestor does not stop while processing the word P_k, that is, k < l, and we
have r(r_T(P_k)) = r_T(Û_k). Finally, by Lemma 4.2 we get

P_{k+1} = r_1(φ^{-1}(η(r(r_T(P_k))))) = r_1(φ^{-1}(η(r_T(Û_k)))) =

r_1(φ^{-1}(φ(Û_{k+1}))) = r_1(Û_{k+1}) = Û_{k+1},

as desired.

Now consider the word P_m = Û_m = W ~ Anc(U). One can easily check
that if procedure Ancestor stops processing the word Anc(U) on Step 2 or
Step 4, then it stops processing the word P_m on Step 2 or Step 4,
respectively, as well. Thus, l = m, and the primary P-series coincides with the
normal U_W-series.

Clearly, since P_k = Û_k ~ U_k, we have L_U[k] = L_P[k], R_U[k] = R_P[k],
h_U[k] = h_P[k], and t_U[k] = t_P[k] for each k = 1, ..., m. This implies that
the normal P_W-series coincides with the normal U_W-series. The proof is
complete. □

In particular, it follows from Lemma 4.3 that N_W(N_W(U)) = N_W(U) for
any word U and any cube-free word W such that W ~ Anc(U). So, the
repeated use of the primary and normal U_W-series has no effect.

Finally, we study the main normal series for almost overlap-free words.

Lemma 4.4. If U is an almost overlap-free word and U ∉ {aaa, bbb},
then there exists the main direct normal U-series and it coincides with the
primary U-series; in particular, N(U) = U.

Proof. Let {U_k}_{k=1}^{ℓ(U)} and {Û_k}_{k=1}^{ℓ(U)} be the primary and the direct normal
U-series respectively. We prove the lemma by induction on k = ℓ(U), ..., 2, 1.
For k = ℓ(U), we have Û_{ℓ(U)} = Anc(U) = U_{ℓ(U)} by the definition of direct
series. Note that since the word Anc(U) is almost overlap-free by Lemma 3.1,
the direct normal series {Û_k}_{k=1}^{ℓ(U)} is the main direct normal series of U.

Now suppose that Û_k = U_k, where 1 < k ≤ ℓ(U), and prove that Û_{k-1} =
U_{k-1}. Let U'_k = η(r(r_T(U_{k-1}))). First, we prove that the word φ^{-1}(U'_k) is
r_1-reduced. Indeed, if aaa ≤ φ^{-1}(U'_k) (or bbb ≤ φ^{-1}(U'_k)), then ababab ≤ U'_k
(resp., bababa ≤ U'_k). However, this is impossible, since U'_k ≤ r(r_T(U_{k-1})) ≤
U_{k-1} by Lemma 3.1 and U_{k-1} is almost overlap-free. Hence the word φ^{-1}(U'_k)
is r_1-reduced and U_k = r_1(φ^{-1}(U'_k)) = φ^{-1}(U'_k). Therefore we obtain

U_{k-1} = L[k-1] r(r_T(U_{k-1})) R[k-1] (by Lemma 3.1)

= L[k-1] h[k-1] U'_k t[k-1] R[k-1] = L[k-1] h[k-1] φ(U_k) t[k-1] R[k-1]

= L[k-1] h[k-1] φ(Û_k) t[k-1] R[k-1] = Û_{k-1},

as desired. □
Lemmas 3.1, 3.2, and 4.4 together provide the following assertion:

if a word U is equivalent to an almost overlap-free word V ∉ {aaa, bbb}
and condition (**) holds, then V = N(U).

Indeed, from Lemma 3.2 it follows that Anc(U) ~ Anc(V), L_U = L_V,
R_U = R_V, h_U = h_V, and t_U = t_V. By Lemma 3.1, the word Anc(V) is
almost overlap-free, in particular, it is cube-free. Thus the Anc(V)-normal
series of the words U and V coincide. Since the word Anc(V) is almost
overlap-free, these normal series are the main normal series of the words U
and V. Hence, N(U) = N(V). By Lemma 4.4 we get V = N(V) = N(U),
as desired.

We conclude this section by explaining why we use the function η, not
ξ, for the construction of Algorithm EqAOF. Replacing η by ξ in procedure
Ancestor, we simplify the formulation of Lemma 3.2, since condition (**) can
be replaced by the following condition, which is obviously true:

r(r_T(U_k)) ~ r(r_T(V_k)) ⟹ ξ(r(r_T(U_k))) ~ ξ(r(r_T(V_k))) for all k < m,

where m = min{ℓ(U), ℓ(V)}. Clearly, Lemmas 3.1, 3.2, and 4.4 hold true
after replacing η by ξ. So, if we retain the same notion of normal series, we
will get the following assertion:

if a word U is equivalent to an almost overlap-free word V ∉ {aaa, bbb},
then V = N(U).

Unfortunately, Lemma 4.1 fails if we use ξ instead of η. Actually, in this
case the word r_T(Û_k) will be obtained from φ(Û_{k+1}) by deleting the symbols
h[k] = h_ξ(r(r_T(U_k))) and t[k] = t_ξ(r(r_T(U_k))) added to the word r(r_T(U_k))
by procedure Ancestor. However, deleting does not preserve the congruence
~ in the general case. For example, consider the word U = bababb. Using ξ
instead of η, we get N(bababb) = babb. The word N(U) is almost
overlap-free, but babb ≁ bababb. In fact, the equivalence class [bababb] contains no
almost overlap-free words. So, we simplify the necessary condition for a
given word U to be equivalent to an almost overlap-free word, but we lose
the sufficient condition provided by Lemma 4.1. On the other hand, the use
of η makes Lemma 4.1 true. This is the main reason why the function η, not
ξ, is used in the construction of Algorithm EqAOF. Surprisingly, as we will
prove in the next section, under certain restrictions, the function η preserves
the congruence ~. Moreover, even if U ~ V and η(U) ≁ η(V) for some pair
(U, V) (that is, the pair (U, V) is bad), an analogue of Lemma 3.2 holds
(see Lemma 5.4 below).

5 Bad pairs of equivalent uniform words

This section is devoted to the study of bad pairs, for which U ~ V and
η(U) ≁ η(V). In the sequel, we write h(U) and t(U) instead of h_η(U) and
t_η(U).

First, we study bad pairs of neighbours. More precisely, we consider pairs
(U, V) such that (U, V) ∈ π and (η(U), η(V)) ∉ π.

Lemma 5.1. Suppose U and V are uniform, |U| < |V|, (U, V) ∈ π, and
(η(U), η(V)) ∉ π. Then {h(U), t(U)} = {a, b} and there exists an even-length
word Y such that |Y| > 2, U = YY, V = YYY, h(Y) = h(U) = h(V), and
t(Y) = t(U) = t(V).

Proof. Since |U| < |V|, we can write U = XYYZ and V = XYYYZ,
where X, Z ∈ Σ* and Y ∈ Σ^+. We show that the word Y is the required one.

First, we prove that the length of Y is even. Assume the converse. If the
word Y is letter-alternating, then V has the factor Y[|Y|] Y Y[1] of the form
aAa or bBb, a contradiction with the uniformity of V. Now let Y = PccQ
for some P, Q ∈ Σ* and c ∈ Σ. Then we have ccQPcc ≤ YY ≤ U and
ccQPcc ≤ YY ≤ V. This contradicts the uniformity of U and V again, since
the length of PQ is odd. Hence, the length of Y is even.

By Proposition 1.8, we have h(U) = h(V) = c and t(U) = t(V) = d for
some c, d ∈ {a, b, λ}. If |c| ≤ |X| and |d| ≤ |Z|, then the words η(U) and
η(V) are neighbours by definition. Hence, at least one of these inequalities
should fail.

Suppose that |d| ≤ |Z| and |c| > |X|, that is, c ≠ λ and X = λ. Then
Y = cY'e for some Y' ∈ Σ* and e ∈ Σ. So, we get η(U) = Y'ecY'eZ'
and η(V) = Y'ecY'ecY'eZ', where Z' ∈ Σ*. By Observation 1.1, the length
of Z' is odd, in particular, Z' ≠ λ. Also, we have Z'[1] = ē = c, since
the factors ec and eZ'[1] start at odd positions. Thus the words

η(U) = (Y'ec)^2 Z'' and η(V) = (Y'ec)^3 Z'', where Z'' = Z'[2 ... |Z'|], are
neighbours, a contradiction.

The case |c| ≤ |X| and |d| > |Z| is symmetric to the above one. Thus,
X = Z = λ and c, d ≠ λ. Then U = YY, V = YYY, Y[1] = c, and
Y[|Y|] = d. Let Y = cY'd, where Y' ∈ Σ*. Then η(U) = Y'dcY' and η(V) =
Y'dcY'dcY'. By Observation 1.1, we have Y' ∈ φ(Σ*) and c ≠ d. If the
word Y is letter-alternating, then the words U and V appear to be φ-images,
since the length of Y is even. However, this contradicts the assumption
h(U), t(U) ≠ λ. Hence, Y is not letter-alternating. From Lemma 1.4, 2) it
now follows that h(Y) = c and t(Y) = d.

Note that Y ≠ cd = cc̄, since the word Y is not letter-alternating.
Therefore, |Y| > 2. This completes the proof. □

As a consequence we get the next proposition.

Proposition 5.1. If a pair (U, V) is bad, then

{h(U), t(U)} = {h(V), t(V)} = {a, b}.

Proof. Let {W_k}_{k=0}^n be a π-linking (U, V)-series. Clearly, if all the pairs
(W_{k-1}, W_k) are good, then the pair (U, V) is good as well. Hence, there exists
an integer k' ≥ 1 such that the pair (W_{k'-1}, W_{k'}) is bad. By Lemma 5.1, we
have {h(W_{k'}), t(W_{k'})} = {a, b}. The required statement now follows from
Proposition 1.8. □

Now consider a bad pair (YY, YYY). By Lemma 5.1, we have

φ^{-1}(η(YY)) = φ^{-1}(η(Y) t(Y) h(Y) η(Y)) = φ^{-1}(η(Y)) t(Y) φ^{-1}(η(Y))

and

φ^{-1}(η(YYY)) = φ^{-1}(η(Y) t(Y) h(Y) η(Y) t(Y) h(Y) η(Y))

= φ^{-1}(η(Y)) t(Y) φ^{-1}(η(Y)) t(Y) φ^{-1}(η(Y)).

So, we get a pair of the form (XcX, XcXcX), where X ∈ Σ^+, c ∈ Σ, and
XcX ≁ XcXcX. For the sequel, we need to establish some useful properties
of such pairs.
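
The passage from (YY, YYY) to (XcX, XcXcX) can be traced on a small example. The Python sketch below is an illustration under assumptions reconstructed from how this section uses the notions, not the paper's own definitions: a word is treated as uniform when all its factors aa and bb start at positions of the same parity, η strips at most one letter from each end so that the rest lies in φ(Σ*), and φ is the Thue-Morse morphism a → ab, b → ba. All function names are hypothetical.

```python
def double_letter_positions(word):
    """1-based start positions of the factors aa and bb in `word`."""
    return [i + 1 for i in range(len(word) - 1) if word[i] == word[i + 1]]


def eta_parts(word):
    """Split a uniform word as h + eta + t with eta a phi-image (assumed rule)."""
    positions = double_letter_positions(word)
    if len({p % 2 for p in positions}) > 1:
        raise ValueError("word is not uniform")
    # A factor cc inside a phi-image must start at an even position,
    # so the first letter is removed when the factors cc start at odd positions.
    h = word[0] if positions and positions[0] % 2 == 1 else ""
    rest = word[len(h):]
    t = rest[-1] if len(rest) % 2 == 1 else ""
    return h, rest[:len(rest) - len(t)], t


def phi_inverse(word):
    """Inverse of the morphism a -> ab, b -> ba on its image."""
    blocks = [word[i:i + 2] for i in range(0, len(word), 2)]
    if any(block not in ("ab", "ba") for block in blocks):
        raise ValueError("word is not a phi-image")
    return "".join(block[0] for block in blocks)
```

For Y = aabb this gives h(Y) = a, η(Y) = ab, t(Y) = b, and the pair (YY, YYY) is mapped to (aba, ababa), which has the form (XcX, XcXcX) with X = a and c = b; by Observation 1.3, these two words are not equivalent.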

Lemma 5.2. Let XcX ≁ XcXcX for some X ∈ Σ^+, c ∈ Σ. Then

1) r_1(XcX) = r_1(X) c r_1(X), r_1(XcXcX) = r_1(X) c r_1(X) c r_1(X).

Suppose additionally that the words XcX and XcXcX are r_1-reduced. Then

2) If X ≠ cc, then XcX, XcXcX ∉ [W]_{r_1} for any W ∈ S_1.

3) If one of the words XcX and XcXcX is letter-alternating, then all
the words XcX, XcXcX, and X are letter-alternating and have odd length.

4) If one of the words XcX and XcXcX has a non-uniform tail, then
the other has the same non-uniform tail.

5) If one of the words XcX and XcXcX has a non-reducible tail, then
the other has the same non-reducible tail.

6) If the word XcXcX is A-whole (B-whole), then both X and XcX
are A-whole (resp., B-whole). If X ∉ {ab, ba} and the word XcX is A-whole
(B-whole), then both X and XcXcX are A-whole (resp., B-whole).

7) If the words XcX and XcXcX are AB-whole and have no
non-reducible tails, then r(XcX) = r(X) c r(X) and r(XcXcX) = r(X) c r(X) c r(X).

8) If the words XcX and XcXcX are uniform, then the length of X is
odd, h(XcX) = h(XcXcX) = h(X), and t(XcX) = t(XcXcX) = t(X).

Proof. Here and below we assume, without loss of generality, that c = a.
Let us prove the first statement of the lemma. Clearly, if

aaa ∈ {X[|X|-1] X[|X|] a, X[|X|] a X[1], a X[1] X[2]},

then XaXaX ~ XXX ~ XX ~ XaX, in contradiction with the lemma's
condition. Hence, the factors aaa and bbb can occur in the words XaX and
XaXaX inside the factor X only. This proves the first statement.

Now suppose that both words XaX and XaXaX are already r_1-reduced.
Let us prove the second statement. Assume the converse. Then V ∈ [W]_{r_1} for
some words V ∈ {XaX, XaXaX} and W ∈ S_1. Note that the word Va
is either a square or a cube. A straightforward check shows that no words
V ∉ [bbabbaabbabb]_{r_1} with this property satisfy the regular expressions listed
in Lemma 1.3. Hence, W = bbabbaabbabb. By Lemma 1.3, the word V has
the following form:

(bba)^2 (bba)* (a(bba)^+)* (abb)* (abb)^2.

In particular, we get X[1] = X[|X|] = b. Therefore the factor aa occurs
in V inside the factor X only. It is easy to check that the word X has
the form (bba)^2 (bba)* (a(bba)^+)* (abb)^+ as a prefix of V and has the form
(bba)^+ (a(bba)^+)* (abb)* (abb)^2 as a suffix of V. Combining these forms, we
conclude that X has the form above. Obviously, the words XaX and XaXaX
have the same form as well. Thus we get X ~ XaX ~ XaXaX, in
contradiction with the lemma's condition. So, V is not equivalent to a word from
S_1.
If the word XaXaX (XaX) is letter-alternating, then the words XaX
and X (resp., X) are letter-alternating as well. On the other hand, since the
factors aa and bb occur in XaXaX inside the factor XaX only, the word
XaXaX is letter-alternating whenever XaX is. As will be shown in the
proof of 8), the length of X is odd whenever the words XaX and XaXaX are
uniform (in particular, letter-alternating). Hence, the lengths of both words
XaXaX and XaX are odd as well. The proof of statement 3 is complete.

Let us prove statements 4-5. Obviously, any tail of the word XaX (non-reducible
or non-uniform) is a tail of XaXaX as well. Conversely, let T be a tail
of XaXaX. If |T| ≤ |XaX|, then T is a tail of XaX as well, and there is
nothing to prove. Suppose that |T| > |XaX|. Without loss of generality we
may assume that T is a left tail. So, XaX is a proper prefix of T. Consider
all possible cases depending on the kind of the tail T.

Clearly, T is not a non-uniform A-tail. Actually, since the words
XaX and XaXaX are r_1-reduced, the word X cannot begin with aa. Now
suppose that T is a non-uniform B-tail, that is, T = (bba)^{2+k} ab for some
k ≥ 0. Since the word XaX is a proper prefix of T, the word Xa is a prefix
of the word (bba)^{2+k}. Therefore X has the form (bba)*bb. Hence, the words
XaX and XaXaX have the form (bba)*bb as well and have no tails, which
is impossible.

If T is a non-reducible B-tail, that is, T = (bab)^k baba(ba)^l bb for some k > 0
and l ≥ 0, then X begins with babb, in particular, the word X is not
letter-alternating. On the other hand, the word T does not contain the factor aa,
therefore X[|X|] = b. Thus, we have baba = X[|X|] a X[1..2] ≤ XaX. Since
XaX is a prefix of T, we conclude that |Xa| ≥ |(bab)^k| and X ≤ (ba)^{l+2} b.
So, the word X appears to be letter-alternating, a contradiction.

Finally, let T = (aba)^k abab(ab)^l aa for some k > 0 and l ≥ 0 (T is a
non-reducible A-tail). Obviously, X[1] = a. Since the words XaX and XaXaX
are r_1-reduced, we have X[|X|] = b. Also, from aa = aX[1] ≤ aX it follows
that |X| < |(aba)^k|. Thus, the word X has the form (aba)*ab. It is easy to
check that the words XaX and XaXaX have the form (aba)*ab as well and
have no tails. This contradiction completes the proof of statements 4-5.

Now let us prove statement 6. First, we prove the following assertion:

if a word P is a prefix and a suffix of a word Q and Q is A- or B-whole,
then the word P is A- or B-whole respectively as well.

Indeed, let Q be A-whole and P = S a(ab)^k aa T for some words S, T ∈ Σ*
and integer k ≥ 0. Since the word P = S a(ab)^k aa T is a prefix of the word Q
and Q is A-whole, we have S = S'ab for some S' ∈ Σ*. On the other hand,
the word S a(ab)^k aa T is a suffix of Q, therefore T = baT' for some T' ∈ Σ*.
Thus, we get P = S' ab a(ab)^k aa ba T', that is, the factor a(ab)^k aa occurs in P
inside the factor (aba)(ab)^k a(aba). This means that the word P is A-whole,
as desired. The same argument works for a B-whole word Q. So, the first
part of statement 6 is proved.

To complete the proof of statement 6, suppose that the word XaX is
A-whole (or B-whole) and show that the word XaXaX is A-whole (resp.,
B-whole) as well. If XaXaX contains no factor of the form aAa (resp., bBb),
there is nothing to prove. Now let XaXaX = RST, where R, T ∈ Σ*,
and S = a(ab)^k aa (resp., S = b(ba)^k bb) for some integer k ≥ 0. Obviously,
either S ≤ XaX or aXa ≤ S. In the first case, the word S = a(ab)^k aa
(resp., S = b(ba)^k bb) occurs in XaXaX inside the factor aba(ab)^k aaba (resp.,
bab(ba)^k bbab), as desired. In the second case, the word X is letter-alternating.
Clearly, if the length of X is odd, then both words XaX and XaXaX are
letter-alternating as well, in contradiction with the assumption S ≤ XaXaX.
Thus, X = (ab)^l or X = (ba)^l for some integer l ≥ 1, whence XaXaX =
(ab)^l a(ab)^l a(ab)^l or XaXaX = (ba)^l a(ba)^l a(ba)^l. Since X ∉ {ab, ba} by the
lemma's condition, we have l > 1. Obviously, both words (ab)^l a(ab)^l a(ab)^l and
(ba)^l a(ba)^l a(ba)^l are AB-whole whenever l > 1. This completes the proof of
statement 6.

Now suppose that the words XaX and XaXaX are AB-whole and have
no non-reducible tails. Let us prove statement 7. By Proposition 1.4, we get
XaX ~ r(XaX), XaXaX ~ r(XaXaX). Hence, by Proposition 1.3 and
Lemma 1.1, the words r(X)ar(X) and r(X)ar(X)ar(X) are AB-whole, have
no non-reducible tails, and satisfy the obvious equalities r(XaX) = r(r(X)ar(X))
and r(XaXaX) = r(r(X)ar(X)ar(X)). So, we will assume that the word
X is already r-reduced and prove that the words XaX and XaXaX are
r-reduced as well.

First, we prove that the word Xa is r-reduced. If it is not r-reduced, then
Xa = X'a(ab)^k aa for some X' ∈ Σ*. After the reduction a(ab)^k aa → aa
inside the words XaX and XaXaX, we obtain two equivalent words:

XaX = X' a(ab)^k aa X' a(ab)^k a → X'aaX'aa(ba)^k = U,

XaXaX = X' a(ab)^k aa X' a(ab)^k aa X' a(ab)^k a → X'aaX'aaX'aa(ba)^k = V.

It follows from Proposition 1.4 that XaX ~ U and XaXaX ~ V. So, we
get XaX ~ XaXaX, which is impossible. Hence, the word Xa is r-reduced.

In a symmetric way one can prove that the word aX is r-reduced as
well. So, if the word XaX is not r-reduced, then it can be decomposed as
XaX = PcQaRcS, where P, Q, R, S ∈ Σ*, c ∈ Σ, PcQ = RcS = X, and
the factor cQaRc has the form aAa or bBb. Clearly, the prefix R and the
suffix Q of the word X do not overlap inside X, since the letter-alternating
word R cannot contain the factor cQ[1] = cc. Thus, X = RTQ, where either
T = λ or T[1] = T[|T|] = c. So, the reduction cQaRc → cc transforms the
pair (XaX, XaXaX) to the pair of neighbours (U, V):

XaX = RTQaRTQ → RTTQ = U,

XaXaX = RTQaRTQaRTQ → RTTTQ = V.

Since XaX ~ U and XaXaX ~ V, we get XaX ~ XaXaX, which is
impossible. Hence, the word XaX is r-reduced.

Finally, suppose that the word XaXaX is not r-reduced and let S be
a factor of XaXaX of the form aAa or bBb. Then we have aXa ≤ S,
because the word XaX is r-reduced. So, the word X is letter-alternating.
If the length of X is odd, then the word XaXaX is letter-alternating as
well, a contradiction with the assumption that the word XaXaX is not
r-reduced. Hence, the length of X is even, whence XaXaX = (ab)^k a(ab)^k a(ab)^k
or XaXaX = (ba)^k a(ba)^k a(ba)^k for some k ≥ 1. The condition k = 1
implies that the word XaXaX is not AB-whole. From k > 1 it follows that
r(XaXaX) = XaX = r(XaX) and XaXaX ~ XaX. Since both cases are
impossible, we have finished the proof of statement 7.

It remains to prove statement 8. Let XaX and XaXaX be uniform.
Obviously, the word X is uniform as well. First, suppose that the word X
is letter-alternating. In this case, the length of X is odd, since the words
(ab)^k a(ab)^k a(ab)^k and (ba)^k a(ba)^k a(ba)^k are not uniform for any k ≥ 1.
Hence, all the words XaX, XaXaX, and X are odd-length letter-alternating
words, therefore h(XaX) = h(XaXaX) = h(X) = λ and t(XaX) =
t(XaXaX) = t(X) = X[|X|].

Now suppose that X[i..i+1] = cc for some letter c and integer i. Then
the factor cc occurs in XaX at the i-th and at the (|Xa| + i)-th positions.
Since the word XaX is uniform, we conclude that the length of X is odd.
Hence, all the words XaX, XaXaX, and X have odd length. Moreover, by
the definition of η, if i is odd, then h(XaX) = h(XaXaX) = h(X) = X[1]
and t(XaX) = t(XaXaX) = t(X) = λ, otherwise h(XaX) = h(XaXaX) =
h(X) = λ and t(XaX) = t(XaXaX) = t(X) = X[|X|]. This completes the
proof of the lemma. □
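
The parity argument in the last paragraph can be checked exhaustively for short words. The following Python sketch is an illustration only. It assumes, as the argument above suggests, that a word is uniform when all its factors aa and bb start at positions of the same parity; the function names and the length bound are hypothetical.

```python
from itertools import product


def double_letter_positions(word):
    """1-based start positions of the factors aa and bb in `word`."""
    return [i + 1 for i in range(len(word) - 1) if word[i] == word[i + 1]]


def is_uniform(word):
    """Assumed definition: all factors aa, bb start at positions of one parity."""
    return len({p % 2 for p in double_letter_positions(word)}) <= 1


def odd_length_claim_holds(max_len):
    """Check: if X contains aa or bb and XaX is uniform, then |X| is odd."""
    for n in range(2, max_len + 1):
        for letters in product("ab", repeat=n):
            x = "".join(letters)
            if double_letter_positions(x) and is_uniform(x + "a" + x):
                if n % 2 == 0:
                    return False          # a counterexample of even length
    return True
```

Under this assumed definition, `odd_length_claim_holds(8)` returns `True`, and the word abaabaab = (ab)a(ab)a(ab) is reported as non-uniform, in line with the first case of the proof.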



Call especial attention to pairs of inequivalent words (XcX, XcXcX) 
with \X\ = 2. Clearly, the word XcX is overlap-free in any such pair and 
[XcJf] ri = XcX. For the word XcXcX, we have 

XcXcX G {abaabaab, abbabbab, baabaaba, babbabba, bbabbabb, aabaabaa] = £2- 
The next lemma describes the equivalence classes of all words from $S_2$.
Lemma 5.3.

$[abaabaab]_{r_1} = (aba)^*(aba)^2ab$, $[abbabbab]_{r_1} = (abb)^*(abb)^2ab$,
$[baabaaba]_{r_1} = (baa)^*(baa)^2ba$, $[babbabba]_{r_1} = (bab)^*(bab)^2ba$,
$[bbabbabb]_{r_1} = (bba)^*(bba)^2bb$, $[aabaabaa]_{r_1} = (aab)^*(aab)^2aa$.

Proof. Clearly, any word of the form $(aba)^*(aba)^2ab$ is equivalent to
$abaabaab$. Conversely, if $W \in [abaabaab]_{r_1}$, then $aW \in [(aab)^3]_{r_1} = [aabaab]_{r_1}$.
By Lemma 1.3, we have $aW = (aab)^k$ for some $k \ge 2$. Hence, $W =
(aba)^{k-1}ab$. Since $abaab \notin [abaabaab]_{r_1}$, we have $k > 2$ and the word $W$ has
the form $(aba)^*(aba)^2ab$, as desired. Thus, $[abaabaab]_{r_1} = (aba)^*(aba)^2ab$.
Now consider the equivalence class $[bbabbabb]_{r_1}$. Obviously, this class
contains all words of the form $(bba)^*(bba)^2bb$. Conversely, let $W \in [bbabbabb]_{r_1}$.
Then $Wa \in [bbabba]_{r_1}$, whence $Wa = (bba)^k$, where $k \ge 2$, by Lemma 1.3.
From $bbabb \notin [bbabbabb]_{r_1}$ it follows that $k > 2$ and $W = (bba)^{k-3}(bba)^2bb$.
So, we get $[bbabbabb]_{r_1} = (bba)^*(bba)^2bb$. The other equivalence classes are
symmetric to the two classes described above. □

We are now ready to prove the main technical lemma of this section.

Lemma 5.4. Let $(U, V)$ be a bad pair such that the word $V$ is almost
overlap-free. Then there exists a word $Y \in \Sigma^+$ such that $V = Y^2$. Furthermore,
denote the word $Y^3$ by $W$ and let $\{Y_k\}_{k=1}^{\ell(Y)}$, $\{V_k\}_{k=1}^{\ell(V)}$, $\{W_k\}_{k=1}^{\ell(W)}$, and
$\{U_k\}_{k=1}^{\ell(U)}$ be the primary $Y$-, $V$-, $W$-, and $U$-series respectively. Then

1) $\eta(U) \sim \eta(W)$.

2) $\ell(W) - 1 \le \ell(Y) \le \ell(U) = \ell(W) \le \ell(V) \le \ell(W) + 1$.

3) $U_k \sim W_k \nsim V_k$ for each $k = 2, \ldots, \ell(W)$.

4) $L_U[k] = L_V[k] = L_W[k] = \lambda$, $R_U[k] = R_V[k] = R_W[k] = \lambda$, $h_U[k] =
h_V[k] = h_W[k]$, and $t_U[k] = t_V[k] = t_W[k]$ for all $k < \ell(W)$;

$L_Y[k] = R_Y[k] = \lambda$, $h_Y[k] = h_W[k]$, and $t_Y[k] = t_W[k]$ for all $k < \ell(Y)$.

5) The words $Y_k$, $V_k$, and $W_k$ are uniform for each $k = 1, \ldots, \ell(W) - 1$.

6) There exists a sequence of letters $\{c_k\}_{k=2}^{\ell(W)}$ such that $V_k = Y_kc_kY_k$ and
$W_k = Y_kc_kY_kc_kY_k$ for each $k = 2, \ldots, \ell(W)$ (if $\ell(Y) = \ell(W) - 1$, then we put
$Y_{\ell(W)} = \lambda$).

7) $Y_{\ell(Y)} \in \{ab, ba, aa, bb, a, b\}$.

Proof. Recall that the words $U$ and $V$ are uniform, because the notion
of bad pair uses the words $\eta(U)$ and $\eta(V)$. So, the set of all r-linking $(V, U)$-
sequences is not empty by Proposition 1.5. For each sequence $\{R_i\}_{i=0}^{n}$ from
this set, define

$\beta(\{R_i\}_{i=0}^{n}) = \mathrm{Card}(\{i \mid (R_{i-1}, R_i) \text{ is bad},\ 1 \le i \le n\})$

and

$\gamma(\{R_i\}_{i=0}^{n}) = \min\{i \mid (R_{i-1}, R_i) \text{ is bad}\}.$

Note that the number $\gamma(\{R_i\}_{i=0}^{n})$ is well defined: since the pair $(V, U)$ is bad,
we have $\beta(\{R_i\}_{i=0}^{n}) \ge 1$.
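To make the two parameters concrete, here is a small sketch that computes the number of bad neighbouring pairs and the index of the first one for a finite sequence, and then picks a sequence minimising the pair lexicographically. The predicate `is_bad` is a placeholder supplied by the caller (the toy predicate in the usage note is purely illustrative and is not the paper's notion of a bad pair); the function names are hypothetical.

```python
from typing import Callable, Sequence


def beta_gamma(seq: Sequence[str], is_bad: Callable[[str, str], bool]) -> tuple[int, int]:
    """Return (number of bad neighbouring pairs, index of the first bad pair)."""
    bad = [i for i in range(1, len(seq)) if is_bad(seq[i - 1], seq[i])]
    if not bad:
        raise ValueError("the first-bad-pair index is undefined: no bad pair")
    return len(bad), bad[0]


def choose_minimal(candidates: Sequence[Sequence[str]],
                   is_bad: Callable[[str, str], bool]) -> Sequence[str]:
    """Pick a sequence whose (count, first index) pair is lexicographically minimal."""
    # Python compares tuples lexicographically, which is exactly the order needed.
    return min(candidates, key=lambda s: beta_gamma(s, is_bad))
```

With the toy predicate `lambda p, q: len(p) != len(q)`, the sequence `["aa", "aa", "aaa", "aaa", "aa"]` has two bad pairs, the first at index 2.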

Among all r-linking $(V, U)$-sequences, we choose the sequence $\{R_i\}_{i=0}^{n}$
having the lexicographically minimal pair $(\beta(\{R_i\}_{i=0}^{n}), \gamma(\{R_i\}_{i=0}^{n}))$. The proof
consists of four steps. In Steps 1-3 we prove that $\beta(\{R_i\}_{i=0}^{n}) = 1$ and
$\gamma(\{R_i\}_{i=0}^{n}) = 1$, that is, the sequence $\{R_i\}_{i=0}^{n}$ has a unique bad pair of
neighbours $(V, R_1)$. As a by-product of this proof, we will get almost all statements
of the lemma. Step 4 is a short argument about the pair $(R_1, U)$.

Step 1. Let $i = \gamma(\{R_j\}_{j=0}^{n})$. We aim to show that $i = 1$, i.e., $R_{i-1} = V$.
To simplify the notation we denote the words $R_{i-1}$ and $R_i$ by $P$ and $Q$
respectively. By Lemma 5.1 there exists a word $Y$ such that one of the
words $P$ and $Q$ is equal to $YY$ and the other is equal to $YYY$. Consider the
primary $Y$-, $P$-, $Q$-, and $V$-series $\{Y_k\}_{k=1}^{\ell(Y)}$, $\{P_k\}_{k=1}^{\ell(P)}$, $\{Q_k\}_{k=1}^{\ell(Q)}$, and $\{V_k\}_{k=1}^{\ell(V)}$
respectively. Clearly, $Y_1 = Y$, $P_1 = P$, $Q_1 = Q$, and $V_1 = V$. By the
definition of $P$ and $i$, the pair $(V, P)$ is good. Hence, we have $\eta(V_1) \sim
\eta(P_1) \nsim \eta(Q_1)$, whence $V_2 \sim P_2 \nsim Q_2$. Moreover, we obviously have

$L_V[1] = L_P[1] = L_Q[1] = L_Y[1] = R_V[1] = R_P[1] = R_Q[1] = R_Y[1] = \lambda$

and, by Proposition 1.8 and Lemma 5.1,

$h(V)t(V) = h(P)t(P) = h(Q)t(Q) = h(Y)t(Y) \in \{ab, ba\}.$

If we denote $\varphi^{-1}(\eta(Y))$ by $Y'$, then we get

$\varphi^{-1}(\eta(YY)) = Y'\varphi^{-1}(t(Y)h(Y))Y' = Y't(Y)Y'$

and

$\varphi^{-1}(\eta(YYY)) = Y'\varphi^{-1}(t(Y)h(Y))Y'\varphi^{-1}(t(Y)h(Y))Y' = Y't(Y)Y't(Y)Y'$






whence $\{P_2, Q_2\} = \{Y_2t(Y)Y_2, Y_2t(Y)Y_2t(Y)Y_2\}$ by Lemma 5.2, 1).
Let us prove by induction on $k$ that

$\{P_k, Q_k\} = \{Y_kc_kY_k, Y_kc_kY_kc_kY_k\}$ and $V_k \sim P_k \nsim Q_k$     (5.1)

for some sequence of letters $\{c_k\}$ and each $k = 2, \ldots, \ell(Y)$. In addition,
we prove that $|\mathrm{Anc}(Y)| \le 2$.

For $k = 2$, we put $c_2 = t(Y)$, and (5.1) holds, as was shown above.
Now suppose that (5.1) holds for some $k$ such that $|Y_k| > 2$ and prove that
$k < \ell(Y)$ and (5.1) holds for $k + 1$ as well.

Obviously, if $|Y_k| > 2$, then $|P_k| > 2$, $|Q_k| > 2$, and $|V_k| > 2$ as well.
By Lemma 5.2, 2), we have $V_k \notin S_1$. From Lemma 3.1 and Corollary 1.2 it
follows that $k < \ell(V)$, $r_T(P_k) \sim r_T(V_k)$, and both words $r_T(V_k)$ and $r_T(P_k)$
are AB-whole.

We claim that $r_T(P_k) = P_k$ and $r_T(V_k) = V_k$, i.e., both words $P_k$ and $V_k$
have no non-uniform tails. Assume the converse. Let $T$ be a tail of the word
$P_k$. Without loss of generality, $T$ is a left A-tail, i.e., $T = (aab)^l(aab)^2ba$
for some $l \ge 0$. According to Lemma 5.2, 4), the word $Q_k$ has the tail $T$ as
well. Since one of the words $P_k$ and $Q_k$ equals $Y_kc_kY_k$, we conclude that $T$
is a prefix of $Y_kc_kY_k$. We have $c_k = b$, because $Y_k$ begins with $aa$ and the
words $P_k$ and $Q_k$ are $r_1$-reduced. If $T \le Y_k$, then both words $r_T(Y_kc_kY_k)$ and
$r_T(Y_kc_kY_kc_kY_k)$ have the factor $aabaabb$ inside the second occurrence of $Y_k$.
So, these words are not AB-whole by definition. But one of them is $r_T(P_k)$,
which is an AB-whole word, a contradiction.

Thus, $Y_k < T$. Then $Y_k = (aab)^{l+2}$ and the words $r_T(P_k)$ and $r_T(Q_k)$
have the suffix $Y_k$, which is not AB-whole. This contradiction proves that
$r_T(P_k) = P_k$. Since $P_k \sim V_k$, by Proposition 1.7, 1) we have $r_T(V_k) = V_k$, as
desired.

From Lemma 5.2, 4), 6) it follows that all the words $P_k$, $Q_k$, and $Y_k$ are
AB-whole and have no non-uniform tails. Moreover, since the overlap-free
word $V_k$ has no non-reducible tails, the words $P_k$, $Q_k$, and $Y_k$ have no
non-reducible tails as well. Hence, procedure Ancestor cannot stop while
processing any of the words $V_k$, $P_k$, $Q_k$, and $Y_k$, that is, $k < \min\{\ell(V), \ell(P), \ell(Q), \ell(Y)\}$.

By Proposition 1.4 and Corollary 1.1 we get

$r(V_k) = V_k \sim r(P_k) \nsim r(Q_k).$

At the same time, we have

$\{r(P_k), r(Q_k)\} = \{r(Y_k)c_kr(Y_k), r(Y_k)c_kr(Y_k)c_kr(Y_k)\}$

by Lemma 5.2, 7), 8), where $r(Y_k)$, $r(P_k)$, and $r(Q_k)$ are odd-length uniform
words. By definition, the function $\eta$ deletes exactly one letter from each word
$r(Y_k)$, $r(P_k)$, and $r(Q_k)$. According to Proposition 5.1 we get

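$\eta(V_k) \sim \eta(r(P_k)) \nsim \eta(r(Q_k))$
whence $V_{k+1} \sim P_{k+1} \nsim Q_{k+1}$.
Additionally, if we put $Y'_k = r(Y_k)$, we get either $h(Y'_k) \ne \lambda$ and $t(Y'_k) = \lambda$
or $t(Y'_k) \ne \lambda$ and $h(Y'_k) = \lambda$. Consider the first case. Then

$\{\varphi^{-1}(\eta(r(P_k))), \varphi^{-1}(\eta(r(Q_k)))\} =$

$\{\varphi^{-1}(\eta(Y'_k)c_kh(Y'_k)\eta(Y'_k)),\ \varphi^{-1}(\eta(Y'_k)c_kh(Y'_k)\eta(Y'_k)c_kh(Y'_k)\eta(Y'_k))\}.$
Hence, $h(Y'_k) = \bar{c}_k$ and we get

$\{\varphi^{-1}(\eta(r(P_k))), \varphi^{-1}(\eta(r(Q_k)))\} =$

$\{\varphi^{-1}(\eta(Y'_k))\,\overline{h(Y'_k)}\,\varphi^{-1}(\eta(Y'_k)),\ \varphi^{-1}(\eta(Y'_k))\,\overline{h(Y'_k)}\,\varphi^{-1}(\eta(Y'_k))\,\overline{h(Y'_k)}\,\varphi^{-1}(\eta(Y'_k))\}.$

In the second case, we have $t(Y'_k) = \bar{c}_k$ and we get
$\{\varphi^{-1}(\eta(r(P_k))), \varphi^{-1}(\eta(r(Q_k)))\} =$

$\{\varphi^{-1}(\eta(Y'_k))\,t(Y'_k)\,\varphi^{-1}(\eta(Y'_k)),\ \varphi^{-1}(\eta(Y'_k))\,t(Y'_k)\,\varphi^{-1}(\eta(Y'_k))\,t(Y'_k)\,\varphi^{-1}(\eta(Y'_k))\}.$

Thus, if we put $c_{k+1} = \overline{h(Y'_k)}\,t(Y'_k)$, we get

$\{P_{k+1}, Q_{k+1}\} = \{Y_{k+1}c_{k+1}Y_{k+1},\ Y_{k+1}c_{k+1}Y_{k+1}c_{k+1}Y_{k+1}\}$

in all cases in view of Lemma 5.2, 1).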
So, we have proved that

$\{P_k, Q_k\} = \{Y_kc_kY_k, Y_kc_kY_kc_kY_k\}$ and $V_k \sim P_k \nsim Q_k$

for each $k = 2, \ldots, \ell(Y)$, and $|\mathrm{Anc}(Y)| \le 2$. Moreover, we have shown that
$V_k$ is uniform, the words $V_k$, $P_k$, $Q_k$, and $Y_k$ have no tails, $h(V_k) = h(P_k) =
h(Q_k) = h(Y_k)$, and $t(V_k) = t(P_k) = t(Q_k) = t(Y_k)$ for all $k < \ell(Y)$.

Now consider the words $Y_m = \mathrm{Anc}(Y)$, $Q_m$, $P_m$, and $V_m$, where $m = \ell(Y)$.
If $Y_m \in \{a, b\}$, then $P_m$ and $Q_m$ are odd-length letter-alternating words.
Since $V_m \sim P_m$, by Observation 1.3 we have

$V_m = P_m = Y_mc_mY_m \in \{bab, aba\}$ and $Q_m = Y_mc_mY_mc_mY_m \in \{babab, ababa\}$.

Obviously, we have $\ell(V) = \ell(P) = \ell(Q) = m + 1$. Note that $t(Y_m) = Y_m$ in
this case. Thus, if we put $Y_{m+1} = \lambda$, we get

$V_{m+1} = P_{m+1} = Y_{m+1}c_{m+1}Y_{m+1}$ and $Q_{m+1} = Y_{m+1}c_{m+1}Y_{m+1}c_{m+1}Y_{m+1}$,

where $c_{m+1} = h(Y_m)t(Y_m) = Y_m$. In the subsequent considerations we put
$Y_{m+1} = \lambda$ if $Y_m \in \{a, b\}$.

If $|Y_m| = 2$, then the class $[Y_mc_mY_mc_mY_m]_{r_1}$ contains no overlap-free words
by Lemma 5.3 and the word $Y_mc_mY_m$ is overlap-free. Hence, we have

$V_m = P_m = Y_mc_mY_m$ and $Q_m = Y_mc_mY_mc_mY_m$.

Clearly, if $Y_m \in \{aa, bb\}$, then the words $V_m$, $P_m$, and $Q_m$ are not AB-whole,
whence $\ell(V) = \ell(P) = \ell(Q) = m$. In the case $Y_m \in \{ab, ba\}$, we have
$\ell(Q) = m$, since the word $Q_m$ is not AB-whole, and $\ell(V) = \ell(P) = m + 1$.
So, in all cases we have $V_m = P_m = Y_mc_mY_m$, $Q_m = Y_mc_mY_mc_mY_m$, and

$\ell(Q) - 1 \le \ell(Y) \le \ell(Q) \le \ell(P) = \ell(V) \le \ell(Q) + 1.$
Hence, $P_k = Y_kc_kY_k$ and $Q_k = Y_kc_kY_kc_kY_k$ for each $k = 2, \ldots, m$, and we get
$P = YY$ and $Q = YYY$.

Notice that the word $V_m = P_m$ is uniform if $\ell(V) = \ell(P) = m + 1$. It is
shown above that the words $V_k$ are uniform for all $k < m = \ell(Y)$. Hence,
the words $V_k$ are uniform for all $k < \ell(V)$.

Step 2. Consider the direct normal forms of the words $Y$, $P$, $V$, and $Q$
and prove that

$V = N_D(P) = N_D(Y)N_D(Y)$ and $N_D(Q) = N_D(Y)N_D(Y)N_D(Y)$.

Since $\mathrm{Anc}(P) = \mathrm{Anc}(V)$ is an overlap-free word, the words $P$ and $Q$ share
the same main direct normal series. Moreover, this series is the primary
series of $V$ by Lemma 4.4.

Let $\{\tilde{Y}_k\}_{k=1}^{\ell(Y)}$ and $\{\tilde{Q}_k\}_{k=1}^{\ell(Q)}$ be the direct normal series of the words $Y$ and
$Q$ respectively. Let $m = \ell(Q)$ and put $Y_m = \tilde{Y}_m = \lambda$ if $m = \ell(Y) + 1$. Thus,
we have

$V_m = Y_mc_mY_m = \tilde{Y}_mc_m\tilde{Y}_m$ and $\tilde{Q}_m = Q_m = Y_mc_mY_mc_mY_m = \tilde{Y}_mc_m\tilde{Y}_mc_m\tilde{Y}_m$.

Now suppose that

$V_{k+1} = \tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}$ and $\tilde{Q}_{k+1} = \tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}$

for some $k \in [2; m-1]$. We prove that the same holds for $k$. From Step 1 we
know that either $h_Y[k] = \lambda$ or $t_Y[k] = \lambda$. First, assume that $t_Y[k] = \lambda$. Then
$c_{k+1} = c_k = \overline{h_Y[k]}$, and we get

$V_k = h_V[k]\varphi(V_{k+1}) = h_Y[k]\varphi(\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}) =$

$h_Y[k]\varphi(\tilde{Y}_{k+1})c_k\bar{c}_k\varphi(\tilde{Y}_{k+1}) = \tilde{Y}_kc_k\tilde{Y}_k$

and

$\tilde{Q}_k = h_Q[k]\varphi(\tilde{Q}_{k+1}) = h_Y[k]\varphi(\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}) =$

$h_Y[k]\varphi(\tilde{Y}_{k+1})c_k\bar{c}_k\varphi(\tilde{Y}_{k+1})c_k\bar{c}_k\varphi(\tilde{Y}_{k+1}) = \tilde{Y}_kc_k\tilde{Y}_kc_k\tilde{Y}_k$,

as desired. Now assume that $h_Y[k] = \lambda$. In this case, $c_{k+1} = t_Y[k] = \bar{c}_k$, and
we get

$V_k = \varphi(V_{k+1})t_V[k] = \varphi(\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1})t_Y[k] =$

$\varphi(\tilde{Y}_{k+1})\bar{c}_kc_k\varphi(\tilde{Y}_{k+1})t_Y[k] = \tilde{Y}_kc_k\tilde{Y}_k$

and

$\tilde{Q}_k = \varphi(\tilde{Q}_{k+1})t_Q[k] = \varphi(\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1}c_{k+1}\tilde{Y}_{k+1})t_Y[k] =$

$\varphi(\tilde{Y}_{k+1})\bar{c}_kc_k\varphi(\tilde{Y}_{k+1})\bar{c}_kc_k\varphi(\tilde{Y}_{k+1})t_Y[k] = \tilde{Y}_kc_k\tilde{Y}_kc_k\tilde{Y}_k$.

So, we have proved that

$V_k = \tilde{Y}_kc_k\tilde{Y}_k$ and $\tilde{Q}_k = \tilde{Y}_kc_k\tilde{Y}_kc_k\tilde{Y}_k$
for all $k = 2, \ldots, m$. Finally, consider the words $V$ and $\tilde{Q}_1 = N_D(Q)$. From
Step 1 we have $t_Y[1] = c_2 = \overline{h_Y[1]}$. Thus, we get

$V = h_V[1]\varphi(V_2)t_V[1] = h_Y[1]\varphi(\tilde{Y}_2c_2\tilde{Y}_2)t_Y[1] =$

$h_Y[1]\varphi(\tilde{Y}_2)t_Y[1]h_Y[1]\varphi(\tilde{Y}_2)t_Y[1] = \tilde{Y}_1\tilde{Y}_1 = N_D(Y)N_D(Y)$

and

$\tilde{Q}_1 = h_Q[1]\varphi(\tilde{Q}_2)t_Q[1] = h_Y[1]\varphi(\tilde{Y}_2c_2\tilde{Y}_2c_2\tilde{Y}_2)t_Y[1] =$

$h_Y[1]\varphi(\tilde{Y}_2)t_Y[1]h_Y[1]\varphi(\tilde{Y}_2)t_Y[1]h_Y[1]\varphi(\tilde{Y}_2)t_Y[1] =$

$\tilde{Y}_1\tilde{Y}_1\tilde{Y}_1 = N_D(Y)N_D(Y)N_D(Y)$,

as desired.
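The final identities of Step 2 are easy to sanity-check on small words. The sketch below assumes that $\varphi$ is the Thue-Morse morphism $a \mapsto ab$, $b \mapsto ba$ (this section does not restate the definition), takes the last letter to be $c_2$ and the first letter to be the other letter of the alphabet, and verifies that lifting $Y_2c_2Y_2$ and $Y_2c_2Y_2c_2Y_2$ yields the square and the cube of the lifted $Y_2$. All function names are hypothetical.

```python
def phi(w: str) -> str:
    """Thue-Morse morphism (assumed): a -> ab, b -> ba."""
    return "".join({"a": "ab", "b": "ba"}[ch] for ch in w)


def complement(c: str) -> str:
    """The other letter of the two-letter alphabet."""
    return "b" if c == "a" else "a"


def lift(y2: str, c2: str) -> tuple[str, str, str]:
    """Lift Y_2, Y_2 c_2 Y_2 and Y_2 c_2 Y_2 c_2 Y_2 one level up.

    The appended last letter t equals c_2 and the prepended first letter h
    is its complement, as in the last computation of Step 2.
    """
    t = c2
    h = complement(c2)
    y1 = h + phi(y2) + t
    v = h + phi(y2 + c2 + y2) + t
    q1 = h + phi(y2 + c2 + y2 + c2 + y2) + t
    return y1, v, q1


if __name__ == "__main__":
    y1, v, q1 = lift("a", "b")
    print(y1, v, q1)
```

The check works because $\varphi(c_2)$ is $c_2$ followed by its complement, which is exactly the last letter followed by the first letter, so each image of $c_2$ closes one copy of the lifted word and opens the next.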

Let $W = N_D(Q)$ and $Y' = N_D(Y)$. By Lemma EJ we have $W \sim Q$ and
$W_2 \sim Q_2$. Let $\{S_j\}_{j=0}^{l}$ be a linking $(W_2, Q_2)$-sequence. Then the sequence
$\{S'_j = h(Q)\varphi(S_j)t(Q)\}_{j=0}^{l}$ is an r-linking $(W, Q)$-sequence by Lemma fL4l 2).
From Proposition 1.8 it follows that $h(S'_j) = h(Q)$ and $t(S'_j) = t(Q)$ for all
$j \le l$. Hence, $\eta(S'_j) = \varphi(S_j)$ for each $j = 0, 1, \ldots, l$, and all pairs $(S'_{j-1}, S'_j)$
for $j \ge 1$ are good.

Recall that $(P, Q) = (R_{i-1}, R_i)$, where $i = \gamma(\{R_j\}_{j=0}^{n})$. Suppose that $i >
1$. Construct a new r-linking $(V, U)$-sequence $\{R'_j\}_{j=0}^{n'}$, where $n' = n + l + 1 - i$,
as follows:

$R'_0 = V$, $R'_{1+j} = S'_j$ for $j = 0, \ldots, l$, $R'_{l+1+j-i} = R_j$ for $j = i+1, \ldots, n$.

Obviously, $\beta(\{R'_j\}_{j=0}^{n'}) = \beta(\{R_j\}_{j=0}^{n})$ and $\gamma(\{R'_j\}_{j=0}^{n'}) = 1 < \gamma(\{R_j\}_{j=0}^{n})$,
which contradicts the choice of the sequence $\{R_j\}_{j=0}^{n}$. Hence, $\gamma(\{R_j\}_{j=0}^{n}) = 1$,
whence $P = V$ and $Q = R_1$. Since $P = YY$ and $V = Y'Y'$, we conclude
that $Y = Y' = N_D(Y)$ and $Q = W$. So, we have $V_k = Y_kc_kY_k$ and $W_k =
Y_kc_kY_kc_kY_k$ for all $k \le \ell(Y)$, where $\{W_k\}_{k=1}^{\ell(W)}$ is the primary $W$-series, and

$\tilde{Y}_k = Y_k$ for all $k \le \ell(Y)$.

Moreover, since the word $V_k$ is uniform for each $k = 1, \ldots, \ell(Y) - 1$, the
word $Y_k$ is uniform for each $k = 1, \ldots, \ell(Y) - 1$ as well. In view of
Proposition 1.2 and Lemma 5.2, 7), all words $W_k$ are uniform for $k < \ell(Y)$ as well.
Note that if $\ell(Y) = \ell(W) - 1$, then $\mathrm{Anc}(Y) \in \{a, b\}$. So, the words $V_k$ and
$W_k$ are uniform for all $k < \ell(W)$ even if $\ell(Y) = \ell(W) - 1$.

This completes the proof of statements 5-7 of the lemma. Additionally,
we have proved statements 2-4 of the lemma for the words $V$, $W$, and $Y$. It
remains to establish a connection between the words $W$ and $U$.

Step 3. In this step, we prove that $\beta(\{R_i\}_{i=0}^{n}) = 1$, i.e., the pair $(V, W)$
is the only bad pair of neighbours. Assume the converse. Let $(P, Q) =
(R_{i-1}, R_i)$, where $i > 1$, be the bad pair such that all pairs $(R_{j-1}, R_j)$ are
good for $1 < j < i$.

By Lemma 5.1 there exists a word $X$ such that $\{P, Q\} = \{XXX, XX\}$.

Let $\{P_k\}_{k=1}^{\ell(P)}$, $\{Q_k\}_{k=1}^{\ell(Q)}$, and $\{X_k\}_{k=1}^{\ell(X)}$ be the primary $P$-, $Q$-, and $X$-series
respectively. Obviously, $P_1 = P$, $Q_1 = Q$, and $X_1 = X$. One can easily prove
(see Step 1) that $\{P_2, Q_2\} = \{X_2d_2X_2, X_2d_2X_2d_2X_2\}$, where $d_2 = t(X)$, and
$W_2 \sim P_2 \nsim Q_2$. Moreover, we have $h(X) = h(P) = h(Q) = h(W)$ and
$t(X) = t(P) = t(Q) = t(W)$, whence $d_2 = c_2$. Now suppose that

$\{P_k, Q_k\} = \{X_kc_kX_k, X_kc_kX_kc_kX_k\}$ and $W_k \sim P_k \nsim Q_k$     (5.2)

for some $k \in [2; \ell(Y) - 1]$, and prove that the same holds for $k + 1$.

First, we show that $|X_k| > 2$. Indeed, if $|X_k| = 2$, then the word $X_kc_kX_k$
is overlap-free, $[X_kc_kX_k]_{r_1} = X_kc_kX_k$, and $X_kc_kX_kc_kX_k \in S_2$. Note that all
words from $S_2$ are not AB-whole. Since the word $W_k$ is AB-whole, but it is
not overlap-free, we get $W_k \nsim X_kc_kX_k$ and $W_k \nsim X_kc_kX_kc_kX_k$, contradicting
(5.2). In the case $|X_k| = 1$, the words $P_k$ and $Q_k$ are letter-alternating. Given
$W_k \sim P_k$ and $V_k < W_k$, we conclude that the words $W_k$ and $V_k$ are
letter-alternating as well. According to Lemma 5.2, 3), the length of $W_k$ and $V_k$
is odd. The inequality $|Y_k| > 2$ yields $|V_k|, |W_k| > 5$, whence $V_k \sim W_k$,
contradicting the condition that $(V, W)$ is a bad pair.

So, we have $|X_k| > 2$, whence $|P_k| > 2$ and $|Q_k| > 2$. Since $P_k \sim W_k$
and the word $W_k$ is uniform, Lemma 5.2 implies that both words $P_k$ and
$Q_k$ are AB-whole and contain neither non-uniform nor non-reducible tails.
Moreover, by Proposition 1.4 and Lemma 5.2, 7) we have

$\{r(P_k), r(Q_k)\} = \{r(X_k)c_kr(X_k), r(X_k)c_kr(X_k)c_kr(X_k)\}$

and $W_k \sim r(P_k) \nsim r(Q_k)$. Finally, the same argument as in Step 1 gives

$\{P_{k+1}, Q_{k+1}\} = \{X_{k+1}c_{k+1}X_{k+1}, X_{k+1}c_{k+1}X_{k+1}c_{k+1}X_{k+1}\}$

and $W_{k+1} \sim P_{k+1} \nsim Q_{k+1}$. Also, we get

$h(X_k) = h(Q_k) = h(P_k) = h(W_k)$ and $t(X_k) = t(Q_k) = t(P_k) = t(W_k)$.

So, by induction on $k$ we have proved that

$\{P_k, Q_k\} = \{X_kc_kX_k, X_kc_kX_kc_kX_k\}$ and $W_k \sim P_k \nsim Q_k$

for each $k = 2, \ldots, \ell(Y)$. Let $m = \ell(Y)$. Consider the words $X_m$, $P_m$, and
$Q_m$. If $Y_m = aa$ or $Y_m = bb$, then the word $P_m$ has the form $(aab)^*(aab)^2aa$
or, respectively, $(bba)^*(bba)^2bb$ by Lemma 5.3. Clearly, $X_m = (aab)^laa$ (resp.,
$(bba)^lbb$), where $l \ge 0$. If $l > 0$, then the words $P_m$ and $Q_m$ are equivalent,
which is impossible. Hence, $l = 0$, $X_m = Y_m$, and $\{P_m, Q_m\} = \{W_m, V_m\}$.

In the case $Y_m = ba$ or $Y_m = ab$, the word $P_m$ has the form $(abc_m)^*(abc_m)^2ab$
or $(bac_m)^*(bac_m)^2ba$ respectively. One can easily check that $X_m = (abc_m)^lab$
(resp., $(bac_m)^lba$), where $l \ge 0$. Again, if $l > 0$, then $P_m \sim Q_m$, which is
impossible. Thus, we have $l = 0$, $X_m = Y_m$, and $\{P_m, Q_m\} = \{W_m, V_m\}$.

Finally, if $Y_m = a$ or $Y_m = b$, then the word $P_m$ is letter-alternating. By
Lemma 5.2, 3), the word $Q_m$ is letter-alternating as well and the lengths
of the words $X_m$, $P_m$, and $Q_m$ are odd. The inequality $|X_m| > 2$ yields
$|P_m|, |Q_m| > 5$, whence $P_m \sim Q_m$, a contradiction. Hence, $|X_m| = 1$, and we
get $X_m = Y_m$ and $\{P_m, Q_m\} = \{W_m, V_m\}$ again.

So, in all cases we get $\{P_m, Q_m\} = \{W_m, V_m\}$. Since $P_m \sim W_m \nsim V_m$, we
conclude that $P_m = W_m$ and $Q_m = V_m$. Hence, we have

$\ell(P) = \ell(W)$, $\ell(Q) = \ell(V)$, $\ell(X) = \ell(Y)$,
and

$\mathrm{Anc}(P) = \mathrm{Anc}(W)$, $\mathrm{Anc}(Q) = \mathrm{Anc}(V)$, $\mathrm{Anc}(X) = \mathrm{Anc}(Y)$.

Obviously, the primary $W$-, $V$-, and $Y$-series are the direct normal series
of the words $P$, $Q$, and $X$ respectively. In particular, we have $Q_2 \sim V_2$
and $Q \sim V$. Let $\{S_j\}_{j=0}^{l}$ be a linking $(V_2, Q_2)$-sequence. Then the sequence
$\{S'_j = h(V)\varphi(S_j)t(V)\}_{j=0}^{l}$ is an r-linking $(V, Q)$-sequence such
that $\eta(S'_j) = \varphi(S_j)$ for each $j = 0, \ldots, l$. Hence, all pairs $(S'_{j-1}, S'_j)$ for
$j \ge 1$ are good. Now, if we replace the subsequence $R_0, \ldots, R_i$ of the
r-linking $(V, U)$-sequence $\{R_i\}_{i=0}^{n}$ by the sequence $\{S'_j\}_{j=0}^{l}$, then we get a new
r-linking $(V, U)$-sequence with a smaller value of $\beta$ than $\beta(\{R_i\}_{i=0}^{n})$. This
contradicts the choice of $\{R_i\}_{i=0}^{n}$. Hence the pair $(V, W)$ is the unique
bad pair among $\{(R_{i-1}, R_i) \mid 1 \le i \le n\}$, as desired.

Step 4. Since all pairs $(R_{i-1}, R_i)$ for $i > 1$ are good, we conclude that
the pair $(W, U)$ is good. In addition, the function $\eta$ deletes exactly one
letter from the word $W_k$ for each $k = 1, \ldots, \ell(W) - 1$. By Proposition 5.1,
condition (**) from Lemma 3.2 holds for the pair $(W, U)$. The rest of the
lemma now follows from Lemma 3.2. □

6 Algorithm EqAOF 

In this section we complete the construction of Algorithm EqAOF. We
start with the description of the second main procedure, called Normalize
(procedure Ancestor is introduced in Sect. 3).

Procedure Normalize.

Input. A word $W \sim \mathrm{Anc}(U)$, the arrays $L$, $R$, $h$, $t$ from procedure Ancestor($U$),
and the number $m = \ell(U)$.

Output. A word Norm($U$, $W$).

Step 1. If $m = 1$, then return $W$, stop.

Step 2. Let $m := m - 1$; $W := h[m]\varphi(W)t[m]$.

Step 3. If $W = Y^3$ for some $Y \in \Sigma^+$, then set $W := Y^2$.

Step 4. Let $W := L[m]\,W\,R[m]$; goto Step 1.

End.

Let $S = S_1 \cup S_2 \cup \{a, b, aa, bb, ab, ba\}$. We are ready to construct Algorithm
EqAOF.

Algorithm EqAOF.

Input. An arbitrary word $U$.

Output. An almost overlap-free word that is equivalent to $U$, or "FALSE" if
no such almost overlap-free word exists.

Step 1. Run Ancestor($U$) to get Anc($U$), the arrays $L$, $R$, $h$, $t$, and the
number $m = \ell(U)$.

Step 2. Find $W \in S$ such that $\mathrm{Anc}(U) \sim W$; if no such word $W$ exists, then
return "FALSE" and stop.

Step 3. Run Normalize($W$, $L$, $R$, $h$, $t$, $m$); let $V :=$ Norm($U$, $W$).

Step 4. If $V$ is almost overlap-free, then return $V$, else return "FALSE".

End.
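The control flow of Algorithm EqAOF can be summarised as a short driver. In the sketch below every subroutine is injected as a parameter, because procedure Ancestor, the search in $S$, and the almost-overlap-freeness test are specified elsewhere in the paper and are not reproduced here; the parameter names are hypothetical, and `None` plays the role of the answer "FALSE".

```python
from typing import Callable, Optional


def eq_aof(
    u: str,
    ancestor: Callable[[str], tuple],            # u -> (Anc(u), L, R, h, t, m)
    find_in_s: Callable[[str], Optional[str]],   # Anc(u) -> equivalent W in S, or None
    normalize: Callable[..., str],               # (W, L, R, h, t, m) -> Norm(u, W)
    is_almost_overlap_free: Callable[[str], bool],
) -> Optional[str]:
    """Driver mirroring Steps 1-4 of Algorithm EqAOF; returns None for "FALSE"."""
    anc, L, R, h, t, m = ancestor(u)             # Step 1
    w = find_in_s(anc)                           # Step 2
    if w is None:
        return None
    v = normalize(w, L, R, h, t, m)              # Step 3
    return v if is_almost_overlap_free(v) else None  # Step 4
```

Passing the subroutines explicitly keeps the sketch independent of any particular implementation of the procedures it calls.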

The next lemma ensures that Algorithm EqAOF works correctly.
Lemma 6.1. For a word $U$, EqAOF($U$) $= V$ if there exists an almost
overlap-free word $V \notin \{aaa, bbb\}$ such that $V \sim U$, and EqAOF($U$) = FALSE
otherwise.

Proof. Obviously, if EqAOF($U$) $= V$, then $V$ is an $r_1$-reduced almost
overlap-free word and $U \sim V$. Conversely, suppose that $U \sim V$, where
$V$ is an almost overlap-free word such that $V \notin \{aaa, bbb\}$. We prove that
EqAOF($U$) $= V$.

Let $\{U_k\}_{k=1}^{\ell(U)}$ and $\{V_k\}_{k=1}^{\ell(V)}$ be the primary $U$- and $V$-series respectively, and let
$m = \min\{\ell(U), \ell(V)\}$. If condition (**) from Lemma 3.2 holds, then $\ell(U) =
\ell(V)$ and $U_k \sim V_k$ for all $k \le \ell(V)$. In particular, we get $\mathrm{Anc}(U) \sim \mathrm{Anc}(V)$.
By Lemma 3.1, 1), we have $\mathrm{Anc}(V) \in S$. Thus, Algorithm EqAOF puts $W =
\mathrm{Anc}(V)$ on Step 2. Since all words $V_k$ for $k < \ell(V)$ are almost overlap-free,
procedure Normalize constructs the main normal $U$-series, which coincides
with the primary $V$-series by Lemma 4.4. So, Norm($U$, $V$) $= N(U) = V$.
Hence, EqAOF($U$) $= V$, as desired.

Now suppose that there exists an integer $k' < m$ such that all pairs
$(r(r_T(U_k)), r(r_T(V_k)))$ for $k < k'$ are good and the pair $(r(r_T(U_{k'})), r(r_T(V_{k'})))$
is bad. From the proof of Lemma 3.2 it follows that $U_k \sim V_k$ for all
$k \le k'$. Moreover, we get $L_U[k] = L_V[k]$, $R_U[k] = R_V[k]$, $h_U[k] = h_V[k]$,
and $t_U[k] = t_V[k]$ for each $k = 1, \ldots, k'$. Let us denote the words $r(r_T(U_{k'}))$
and $r(r_T(V_{k'}))$ by $U'$ and $V'$ respectively. By Lemma 5.4 there exists a word
$Y$ such that $V' = YY$, $\ell(U') = \ell(W')$, and $\mathrm{Anc}(U') \sim \mathrm{Anc}(W') \in S_2$, where
$W' = YYY$. Let $\{W'_k\}_{k=1}^{\ell(W')}$ and $\{Y_k\}_{k=1}^{\ell(Y)}$ be the primary $W'$- and $Y$-series
respectively. Since $\mathrm{Anc}(U') = \mathrm{Anc}(U)$, Algorithm EqAOF, running on $U$,
chooses $W = \mathrm{Anc}(W')$ on Step 2.

According to Lemma 5.4, the $W'$-normal series of $U'$ coincides with the
primary $W'$-series. Since $|W'_m| = |Y_mc_mY_mc_mY_m| = 3|Y_m| + 2$, the word $W'_m$
is not a cube for each $m = 2, \ldots, \ell(W')$. Hence, the condition on Step 3 of
procedure Normalize is not fulfilled, and procedure Normalize, running on $W$,
constructs the $W'$-normal series of $U'$, while $m = \ell(U), \ldots, \ell(U) - \ell(U') + 2$.

On the iteration with $m = \ell(U) - \ell(U') + 1 = k'$ we obtain the word
$W' = YYY$. On Step 3 of this iteration, procedure Normalize reduces
$W'$ to the word $V' = YY$. Since the word $V_{k'}$ is almost overlap-free, we
get $V' = r_T(V_{k'})$. Hence, on Step 4 we obtain the word $L_U[k']\,V'\,R_U[k'] =
L_V[k']\,r_T(V_{k'})\,R_V[k'] = V_{k'}$. After that procedure Normalize restores all words
$V_m$ for $m = k' - 1, \ldots, 1$ and returns the word Norm($U$, $W$) $= V$. So,
Algorithm EqAOF returns $V$, as desired. □

The next lemma estimates the time complexity of Algorithm EqAOF. 

Lemma 6.2. Algorithm EqAOF has $O(|U|)$ time complexity, where $U$ is
an input word.

Proof. Step 1 runs procedure Ancestor. The cycle in Ancestor consists
of constant-time and linear-time operations (checks and reductions). The
nontrivial check in Step 1 of Ancestor is linear, because two given classes are
recognizable languages. So, if the running time of one iteration of the cycle
in Ancestor is bounded by $C|U| + D$ for any given word $U$, then the complexity
of Ancestor is bounded by
$\sum_{k=1}^{\ell(U)}(C|U_k| + D)$. Since $|U_k| \le 2^{-k+1}|U|$ and $\ell(U) \le \log|U|$, we conclude
that the time complexity of Ancestor is bounded by

$C|U|\sum_{k=0}^{\infty} 2^{-k} + D\log|U| = 2C|U| + D\log|U|.$

Hence, procedure Ancestor runs in $O(|U|)$ time.
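The geometric-series estimate can be checked numerically. The sketch below sums the per-level bound $C|U_k| + D$ under the worst-case assumption $|U_k| = 2^{-k+1}|U|$ with $\lfloor \log_2 |U| \rfloor$ levels, and compares it with the closed form $2C|U| + D\log_2|U|$. The base of the logarithm, the constants, and the function name are assumptions made for the illustration.

```python
import math


def ancestor_cost_bound(n: int, C: float = 1.0, D: float = 1.0) -> tuple[float, float]:
    """Return (level-by-level sum, closed-form bound) for an input of length n >= 2."""
    levels = math.floor(math.log2(n))          # number of levels, at most log2(n)
    total = sum(C * n * 2.0 ** (-(k - 1)) + D for k in range(1, levels + 1))
    closed_form = 2 * C * n + D * math.log2(n)
    return total, closed_form


if __name__ == "__main__":
    for n in (2, 10, 1000, 10**6):
        total, closed_form = ancestor_cost_bound(n)
        print(n, total, closed_form)
```

The level-by-level sum never exceeds the closed form, since a finite geometric sum with ratio 1/2 is below 2 and the number of levels is at most the logarithm.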

As to the complexity of Step 2 of Algorithm EqAOF, from Lemmas 1.3
and 5.3 it follows that the equivalence class of any word of $S$ is a recognizable
language. Thus, the word $\mathrm{Anc}(U)$ is examined by a finite set of fixed finite
automata, and the complexity of this step is linear with respect to $|\mathrm{Anc}(U)|$.
Procedure Normalize, applied on Step 3, runs in $O(|U|)$ time for the same
reason as procedure Ancestor. Finally, a word can be checked for almost
overlap-freeness in linear time as well. The corresponding algorithm can be
constructed, for example, by modifying Algorithm A' from [13]. We see that
Algorithm EqAOF has $O(|U|)$ time complexity. □

Lemmas 6.1 and 6.2 together prove Theorem 2.